PREFACE


There is a well-known bit of advice for an author which says that preferably 
before beginning, but certainly before finishing a book, he should ask himself 
why he is writing it. This fundamental question and the author’s answer to it 
are of interest to the potential user of a mathematical text within the context 
of a web of related questions. Is this book really necessary? Does it have any
novel features that distinguish it from other books? For whom is it designed?
What are the prerequisites? Where does it take the reader? What is the subject
matter? How is the material organized and structured?

In recent years it has become fairly standard practice for colleges to offer 
a one-semester course in linear algebra at the sophomore level, the justification 
and purpose being the connection with certain topics normally covered in 
second year calculus. After teaching such a course on a number of occasions, 
with singular lack of success, I emerged with the strongly held view that not 
only was the justification for the course insufficient, but even more that the 
nature of the material was inappropriate for students at the given level of 
mathematical experience. Although the available texts were more than adequate,
only the most gifted students were in a position to develop successfully,
in the time-span of a single semester, the required combination of geometric 
insight, capacity for abstract reasoning, and technique of formal proof. The 
conclusion to be drawn, almost inevitably, was that the study of linear algebra 
should be deferred, reserving it for students with some mathematical maturity. 





Once the preceding line of argument is accepted, the problem to be addressed
is how to introduce the interested student to the spirit, style, and
content of modern (some prefer to call it “abstract”) algebra. There are many
fine texts of this genre available, but they tend to be aimed at the junior level 
and are consequently not quite suitable for our purpose. The true beginner 
usually finds that they go too far too fast and are too abstract; he is unable to 
assimilate the many new definitions and concepts, and does not possess the 
mathematical sophistication to manipulate them in a meaningful way. 

Out of such considerations there evolved, over a period of several years, 
a book with the following characteristics: 

It is designed for a full year course at the freshman or the sophomore level. 
Since there are no prerequisites other than familiarity with the standard elementary
topics of high school mathematics (especially algebra), this course
may be taken before, after, simultaneously with, or even in place of first-year 
calculus. Therefore, this text can also be used for well-prepared and motivated 
high school seniors. 

The style is slow-moving, detailed, quite wordy, and occasionally repetitive, 
as dictated by a struggle to provide the student (insofar as space and other 
constraints permit) with a book he can read with a minimum of outside help. 
In this connection it may be usefully noted that preliminary versions of this 
text have been used, and found readable, by such diverse groups as: students 
of liberal arts intending to major in mathematics, students in a school of 
education preparing to become mathematics specialists, and high school 
mathematics teachers (in an NSF institute) anxious to enlarge their mathematical
horizons by learning material that impinges upon and enriches the
subject matter in their own curriculum. 

No attempt has been made to produce an encyclopedic book, or to throw 
masses of material at the student, or to probe deeply. Rather, I have tried to 
treat a small number of topics, selected, in the main, for their contribution 
toward the larger objective of studying algebraic questions arising out of number
theory, with care, precision of thought and expression, and a certain
amount of thoroughness. 

Experience provides ample verification of the assumption that in teaching 
mathematics it is pedagogically essential to motivate concepts carefully, and 
to proceed from the familiar to the unfamiliar, from the specific to the general, 
from the concrete to the abstract. Adherence to these guidelines is a central 
feature of this book. Thus, we begin by studying the most familiar mathematical
objects, the integers, and then use them to motivate, introduce, and
illustrate some of the basic concepts of modern algebra. The underlying
thread or connective tissue for the book is the interweaving of certain questions
of elementary number theory with the tools, techniques, and concepts
of modern, abstract algebra. Topics from linear algebra are consciously excluded
since they would constitute a diversion from the main thrust of the
book.

Each section (with the exception of the first few, which are relatively short) 
centers on a single topic and develops it rather fully—thus the sections tend 
to be considerably longer than is customary. Roughly speaking, the material 
in each section is arranged according to increasing complexity and with decreasing
detail. Many examples are included, mostly of a numerical or computational
nature. They serve to illustrate and reinforce the discussion in the
text and to provide previews of coming attractions. Each section closes with 
a large number of problems, ranging from the purely mechanical to the 
theoretical and abstract. While knowledge of the problems is not needed for 
subsequent developments in the text, it is essential that the student work on 
lots of them, even if his success is limited. This is the best way really to understand
and master the subject matter. Selected answers to the problems appear
at the back of the book. At the end of each chapter, there is a collection of 
“miscellaneous problems.” These are usually of a more advanced type; some 
of them, if expanded fully, could make up an entire section in this, or any 
other, book. Only the very best students can be expected to make serious 
inroads in the miscellaneous problems. 

A few additional comments are in order. I have preferred to plunge right 
in and deal with mathematical questions instead of starting with a preliminary 
section on sets and functions. These are discussed at the appropriate time— 
namely, when they are needed. Some may consider the selection of topics 
perverse on the grounds of being too pedestrian or too difficult. I would merely 
observe that the amount of space devoted to a topic in the text is not necessarily
a measure of its mathematical importance. Moreover, the precise contents,
and the emphasis, of a given course are determined according to the
personal tastes of the instructor. This book attempts to give him some leeway 
in his choices. 

It remains for me to express my deep appreciation to: Sandra Spinacci, 
who typed the manuscript with her usual skill and efficiency; Bill Adams, who 
read the manuscript carefully and found a number of errors; my family and 
S. Shufro, who provided encouragement and assistance; my students, who 
participated (unknowingly) in all kinds of pedagogical experiments; the people 
at Academic Press, individually and collectively, for their competence, courtesy,
and unfailing cooperation.




ELEMENTARY NUMBER THEORY 


In this chapter, we shall be concerned exclusively with the set of all integers:
positive, negative, and zero. This set is denoted by

Z = {0, ±1, ±2, ±3, ...}

and whenever we write any of the symbols a, b, c, d, ..., x, y, z, ... they shall
represent elements of Z.

It is taken for granted that we are familiar with the integers; however, in 
order not to complicate matters unnecessarily at the start, we choose to be 
imprecise as to exactly what we know about them. Of course, it is assumed that 
we know how to operate, or compute, with integers. As the discussion progresses,
additional rather obvious properties of the integers will be used, and
we shall try to point these up explicitly at the appropriate time. Thus, at the 
current stage, our attitude toward the integers will be relatively informal as 
compared to the formal axiomatic approach to algebraic objects that will be 
adopted in later chapters. 
1-1. Divisibility

It is assumed that the facts about addition, subtraction, and multiplication
of integers are known; however, “officially” we do not recognize the existence
of “division.” We consider only objects that belong to Z, so symbols like
“5/2” or “2/5” are outside our realm of discourse. From this point of view,
“fractions,” that is, symbols “m/n,” where m ∈ Z (we read m ∈ Z as: m
is an element of Z, m is in Z, or m belongs to Z) and n ∈ Z, have no meaning.
Our purpose here is to begin the investigation of questions of “divisibility” in
Z, without going outside of Z to do so.

1-1-1. Definition. Given a, b ∈ Z (this notation asserts that both a and b
are elements of Z) with a ≠ 0; we say that a divides b (and write this
symbolically as a | b) when there exists c ∈ Z such that ac = b. There are
alternate ways to express this, namely: a is a divisor of b, a is a factor of b,
b is a multiple of a, b is divisible by a.

If no such c exists, we say that a does not divide b (or: a is not a divisor
of b, a is not a factor of b, b is not a multiple of a, b is not divisible by a)
and write a ∤ b.

It should be noted that the statement a | b includes the fact that a ≠ 0; thus,
according to the definition, 0 does not divide any integer.

The definition of a | b is concerned only with the theoretical question of
the existence of the desired element c; it says nothing about how to decide, in
practice, if such an integer c exists or how to find it. For example, suppose
we wish to decide if a = 7 divides b = 91. The reader would settle this quickly
by cheating, that is, by using long division. For us, unfortunately, the restriction
on division remains in force (in particular, we do not know about long
division), so we must somehow search through Z seeking an element c for
which 7c = 91. By testing c = 1, 2, 3, ..., in turn, we see eventually that
7 · 13 = 91, so that 7 | 91.

In the same spirit, let us decide if a = 7 divides b = 99. Having observed
that 7 · 13 = 91, we note further that

7 · 14 = 98 < 99 < 7 · 15 = 105.

Therefore, if c ≤ 14, then 7c < 99, and if c ≥ 15, then 7c > 99; it follows that
7 ∤ 99.
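
The search through multiples used in these two examples can be sketched in a few lines of Python. This is an illustration only, not part of the formal development: the function name divides_by_search is hypothetical, and the sketch assumes a > 0 and b ≥ 0, so that testing c = 0, 1, 2, ... in turn must eventually reach or pass b. Only multiplication and comparison are used, in keeping with the restriction on division.

```python
def divides_by_search(a, b):
    """Decide whether a | b (for a > 0, b >= 0) by testing c = 0, 1, 2, ... in turn."""
    if a <= 0 or b < 0:
        raise ValueError("this sketch handles only a > 0 and b >= 0")
    c = 0
    while a * c < b:   # keep going while the multiple a*c is still below b
        c += 1
    return a * c == b  # either we hit b exactly, or we stepped past it


print(divides_by_search(7, 91))  # True, since 7 * 13 = 91
print(divides_by_search(7, 99))  # False, since 7 * 14 = 98 < 99 < 7 * 15 = 105
```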

The discussion above is designed to emphasize our avoidance of long 
division, and surely it would not be very exciting to test other pairs of integers 
a and b for divisibility. Instead, we turn to some of the immediate theoretical 
consequences of the definition of divisibility. 
1-1-2. Facts. For elements of Z, the following properties hold:

(i) a | 0 for every a ≠ 0.

Proof: Here b = 0, so by taking c = 0 we have ac = b. ∎

(ii) If a | b and b | c, then a | c.

Proof: By hypothesis, there exist d, e ∈ Z such that b = ad and c = be.
Therefore, c = a(de), so that de is an integer whose existence guarantees that
a | c. ∎

(iii) (±1) | a for all a, and (±a) | a for all a ≠ 0.

Proof: Both statements follow immediately from the equation a = (a)(1) =
(−a)(−1). ∎

(iv) The following four assertions are equivalent:

(1) a | b,
(2) a | (−b),
(3) (−a) | b,
(4) (−a) | (−b).

Proof: By equivalence of the four assertions, we mean that any two of
them are equivalent. Of course, in general, two assertions are said to be
equivalent when each one implies the other. Thus, equivalence of our four
assertions amounts to saying that each assertion implies the other three, so
there are a total of 12 implications to be proved. However, by proceeding
“cyclically,” it clearly suffices to prove only the following four implications:

(1) ⇒ (2), (2) ⇒ (3), (3) ⇒ (4), (4) ⇒ (1).

The proofs of all four of these implications proceed the same way, so we content
ourselves with proving just one of them, namely, (2) ⇒ (3); the reader
can easily provide proofs for the three remaining implications.

When (2) is given we have a | (−b), so there exists c ∈ Z such that ac = −b.
By the properties of “minus,” we have, therefore, b = (−a)(c). This says that
(−a) | b and completes the proof that (2) ⇒ (3). ∎

A common notation used to express all the equivalences just proved is

a | b ⇔ a | (−b) ⇔ (−a) | b ⇔ (−a) | (−b)

or simply,

(1) ⇔ (2) ⇔ (3) ⇔ (4).

One immediate, but important, consequence of these equivalences is that
in considering questions of divisibility there is nothing lost in assuming that
a > 0; for after all, a | b ⇔ (−a) | b.

(v) If a | b and a | c, then a | (bx + cy) for all choices of x, y ∈ Z. In particular,
if a | b and a | c, then a | (b + c) and a | (b − c).

Proof: We must show that bx + cy can be written in the form a times an
integer. The proof is straightforward. By hypothesis, we may write b = ad,
c = ae, where d, e ∈ Z. Therefore,

bx + cy = adx + aey = a(dx + ey)

which says that a | (bx + cy). In particular, by making a specific choice of x
and y, namely, x = 1, y = 1, we see that a | (b + c). Similarly, by putting
x = 1, y = −1, we see that a | (b − c). ∎

It is convenient to call any element of form bx + cy (or xb + yc), where
x, y ∈ Z, a linear combination of b and c. Note that the following numbers are
(or more precisely, can be expressed as) linear combinations of b and c: b
(take x = 1, y = 0), −c (take x = 0, y = −1), 0 (take x = 0, y = 0), 2b + 3c,
3b − 5c = 3b + (−5)c, −3b + 5c, .... With this terminology, the result
proved above may be restated in the following form: If a divides both b and c,
then it divides any linear combination of b and c; in particular, if a divides b
and c, then it divides their sum and their difference.
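
As a numerical sanity check on (v), the following Python sketch tests a | (bx + cy) over a small grid of coefficients. The helper names divides and check_linear_combinations, the sample values, and the bound on x and y are all assumptions made for illustration; the remainder operator % is used purely as a convenience, and a finite check of this kind is of course no substitute for the proof.

```python
from itertools import product


def divides(a, b):
    """Return True when a | b, that is, a != 0 and b is a multiple of a."""
    return a != 0 and b % a == 0


def check_linear_combinations(a, b, c, bound=5):
    """Check that a | (b*x + c*y) for all x, y with |x|, |y| <= bound."""
    if not (divides(a, b) and divides(a, c)):
        raise ValueError("the hypothesis a | b and a | c must hold")
    return all(divides(a, b * x + c * y)
               for x, y in product(range(-bound, bound + 1), repeat=2))


print(check_linear_combinations(7, 21, -35))  # True
```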

(vi) |a| = |b| ⇔ a = ±b.

Proof: Many of us may well be prepared to consider this assertion as
obvious, but a somewhat more convincing argument is required. For this,
it is necessary to recall the definition of “absolute value.”

First of all, one defines

|0| = 0.

Then, for a ≠ 0, one notes that the integers a and −a are distinct, with one
of them positive and the other negative. (Naturally, we are assuming that the
basic facts about positive and negative integers are known. It may also be
noted in passing that 0 is, by common usage, neither positive nor negative.)
One then defines, for a ≠ 0,

|a| = the positive member of the set {a, −a}.

Thus, |a| is either a or −a, and we “abbreviate” this by writing |a| = ±a. Of
course, this definition of absolute value is the same as the one commonly used
for real numbers; some of the properties of absolute value are given in the
problems (see 1-1-3, Problem 7).

Returning to the proof, let us suppose that a = ±b, that is, a equals plus
or minus b. The trivial case occurs when a = b = 0, as it is then immediate
that |a| = |b| = 0. Upon discarding the trivial case, we have both a ≠ 0 and
b ≠ 0. In this case, a = ±b and also −a = ±b; so it follows that {a, −a} =
{b, −b}, meaning that the two sets {a, −a} and {b, −b}, each of which
consists of two distinct elements, are identical. The definition of absolute
value now says that |a| = |b|, thus proving the implication ⇐.

Conversely, to prove the implication ⇒, let us suppose that |a| = |b|.
Excluding the trivial case where |a| = |b| = 0 (as then a = b = 0), we may
assume that |a| = |b| ≠ 0, so a ≠ 0 and b ≠ 0. Since |b| = ±b we have
|a| = ±b, which says that |a| belongs to the two element set {b, −b}. Hence,
the positive member of {a, −a} is equal to the positive member of {b, −b}.
There are four possibilities for the two positive members, namely a and b,
a and −b, −a and b, −a and −b, and in each case, a = ±b. This completes
the proof. ∎

(vii) If a | b and b ≠ 0, then |a| ≤ |b|.

Proof: By hypothesis there exists c ∈ Z such that b = ac, and c ≠ 0
because b ≠ 0. Taking absolute values yields |b| = |ac| = |a| · |c|, and because
c is a nonzero integer we know that 1 ≤ |c|. Consequently,

|a| = |a| · (1) ≤ |a| · |c| = |b|. ∎

It should be noted that this proof makes use of the following “facts” about
integers: The absolute value of a product equals the product of the absolute
values; there is no integer between 0 and 1; multiplying an inequality by a
positive number preserves the inequality. It may also be noted that the proof
makes heavy use of the assumption that b ≠ 0; in fact, if b = 0 the conclusion
of (vii) is false.

(viii) If a | (±1), then a = ±1.

Proof: One way to verify this is to observe that, according to (vii),
a | (±1) implies |a| ≤ |±1| = 1. Because a ≠ 0, it follows that |a| = 1 = |1|,
and hence, in virtue of (vi), a = ±1. ∎

(ix) If a | b and b | a, then a = ±b.

Proof: By the definition of divisibility, we know that a ≠ 0 and b ≠ 0.
Applying (vii) yields |a| ≤ |b| and |b| ≤ |a|. Consequently, |a| = |b|, and by
(vi) a = ±b. ∎

1-1-3 / PROBLEMS 

1. How many integers between 20 and 200 are divisible by 7? How many are
divisible by 11?

2. Prove that if ac | bc, then a | b. If a | b and c | d, then ac | bd.

3. Show that if a | b and 0 ≤ b < a, then b = 0.

4. For a = 10, how many integers d are there such that d | a and 1 < d < a?
What is the sum of all such divisors d? Answer the same questions for
a = 17, 24, 72, 77, 79, 97, 210, 420.

5. Can you find an integer greater than 1 which divides both a = 5311 and
b = 7571? Can you find a positive integer less than ab which is a multiple of
both a and b?

6. Do Problem 5 for a = 4181, b = 7663.

7. For a, b, r₁, r₂ ∈ Z prove that:

(i) |ab| = |a||b|,
(ii) |a + b| ≤ |a| + |b|,
(iii) |a − b| ≤ |a| + |b|,
(iv) If 0 ≤ r₁ < a and 0 ≤ r₂ < a, then |r₁ − r₂| < a.

Of course, the proofs will apply also when a, b, r₁, r₂ are real numbers.

8. (i) List some integers that can be expressed as linear combinations of
b = 3 and c = 6, and others which cannot be so expressed. In which
class does the integer 1 fall?
(ii) Answer the same questions for b = 3, c = 5.
(iii) Answer the same questions for b = 4, c = 6.

9. Write out, in words, the definition of each of the following sets:

(i) {2n | n ∈ Z},
(ii) {2m + 1 | m ∈ Z},
(iii) {2x + 3y | x, y ∈ Z},
(iv) {ax + by | x, y ∈ Z},
(v) {b − xa | x ∈ Z},
(vi) {d ∈ Z | d | a},
(vii) {d > 0 | d | a},
(viii) {n > 0 | Σ_{i=1}^{n} i = n(n + 1)/2}.


1-2. The Division Algorithm 

Now that the elementary properties of divisibility are understood, we turn
to a crucial result which is normally taken for granted. Given a, b ∈ Z with
a > 0, then a may divide b or it may not; however, there always exists an
expression for b in terms of a, and this expression also settles the question of
divisibility, namely,

1-2-1. Division Algorithm. Given a, b ∈ Z with a > 0, then there exist
unique integers q and r such that

b = qa + r, 0 ≤ r < a.

Proof: This result is, of course, quite familiar. To illustrate: if a = 7 is
fixed, then taking in turn b = 10, 5, 28, −5, −91, 94, 1, 0, −1 we have

10 = 1 · 7 + 3,
5 = 0 · 7 + 5,
28 = 4 · 7 + 0,
−5 = (−1) · 7 + 2,
−91 = (−13) · 7 + 0,
94 = 13 · 7 + 3,
1 = 0 · 7 + 1,
0 = 0 · 7 + 0,
−1 = (−1) · 7 + 6.

As for the division algorithm itself, there are two assertions to be proved— 
firstly, the existence of q and r, and secondly, that there is only one pair of 
integers q and r which satisfy the conditions. 

Existence: The proof of existence is somewhat formal, so it is useful to
sketch the geometric idea on which the formal proof rests. Suppose for illustrative
purposes that b > a. Let us take the point 0, that is the origin, on the
real line (also known as the “number line,” and which is just another, more
geometric, way of referring to the set R of all real numbers) and then mark off
the point a. We can then mark off the point 2a by “adding” the segment of
size a to itself. This process may be repeated as many times as desired, giving
a, 2a, 3a, 4a, .... Clearly, if a is added to itself enough times, we arrive at a
number bigger than b. (This is, of course, an example of the “Archimedean”
property of the real numbers.) In particular, there is a last, or biggest, multiple
of a, call it qa, such that qa ≤ b, and the next multiple of a, (q + 1)a, is greater
than b. In other words, we have

qa ≤ b < (q + 1)a


and the picture on the real line is 

[Figure: the real line, with the points 0, a, 2a, ..., qa, (q + 1)a marked off and b located between qa and (q + 1)a.]


We may now put r = b − qa, so that b = qa + r and 0 ≤ r < a.

Suppose that for any integer n, we let [na, (n + 1)a) denote the set of all
real numbers x such that na ≤ x < (n + 1)a, so

[na, (n + 1)a) = {x ∈ R | na ≤ x < (n + 1)a}.

Geometrically (that is, on the real line), [na, (n + 1)a) is an “interval” which
includes its endpoint on the left but not its endpoint on the right. The geometric
idea underlying the discussion above is that as n runs over Z, b falls
in an interval (in fact, exactly one) of form [na, (n + 1)a).
The preceding sketch of a proof is defective in that it seems to make use
of real numbers and of a geometric “picture.” We turn, therefore, to a more
“satisfactory” proof, namely, a formal algebraic proof in which objects
outside of Z never occur.

Given a, b ∈ Z with a > 0, let

S = {b − xa ≥ 0 | x ∈ Z}.

In words, S is defined to be the set of all those integers of form b − xa which
are greater than or equal to 0, when x is allowed to run over Z. (Note that
the vertical stroke “|” which appears in the definition of S has nothing to do
with divisibility.) To put it another way, as x runs over Z, then presumably
some of the integers b − xa are greater than or equal to 0 and some are less
than 0; we consider only those b − xa which are greater than or equal to 0.
Of course, S may equally well be written as

S = {b − xa | x ∈ Z, b − xa ≥ 0}.

We assert first of all that S ≠ ∅ (the proper way to read this notation is:
S is nonempty). To show this, it suffices to exhibit an x for which b − xa ≥ 0.
Suppose we try x = −|b|. Because a ≥ 1 we have |b| · a ≥ |b| · 1 = |b|, and
then adding b to both sides gives

b + |b| · a ≥ b + |b|.

But b + |b| ≥ 0, so that b + |b| · a ≥ 0, and hence x = −|b| works. (By
retracing the steps of our argument, it is not hard to see why we chose to
try x = −|b|; it was not pulled out of a hat.)

Since S is a nonempty set of nonnegative integers, it contains a smallest
element (the existence of such a smallest element appears obvious, but it is
really an assumption about Z); call it r. Denote a value of x for which r
arises, by q. (In other words, q is a value of x for which b − qa = r.) Thus,
r ≥ 0, r = b − qa, and r is less than any other element of S.

It remains to show that r < a. Suppose this is false, that is, suppose r ≥ a.
Now consider the integer

r − a = (b − qa) − a = b − (q + 1)a.

Since r − a ≥ a − a = 0, we see that (r − a) ∈ S. Furthermore, r − a < r
because a > 0. This says that r − a is an element of S which is smaller than r,
contradicting the choice of r. Our assumption that r ≥ a having led to a contradiction,
we conclude that r < a. The existence part of the proof is now
complete.
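
The existence argument is close to a procedure, and it can be sketched in Python as follows. This is an illustrative sketch under the proof's assumption a > 0, not a method given in the text: the function name division_by_least_element is hypothetical. It starts from the element of S obtained with x = −|b| and keeps subtracting a for as long as the result stays in S, so that only addition, subtraction, multiplication, and comparison are used.

```python
def division_by_least_element(a, b):
    """Return (q, r) with b = q*a + r and 0 <= r < a, for a > 0."""
    if a <= 0:
        raise ValueError("this sketch follows the case a > 0")
    x = -abs(b)        # the proof's starting choice: b - x*a = b + |b|*a >= 0
    r = b - x * a      # an element of S
    while r - a >= 0:  # r - a = b - (x + 1)*a is a smaller element of S
        x += 1
        r -= a
    return x, r        # r is now the least element of S, and q = x


for b in (10, 5, 28, -5, -91, 94, 1, 0, -1):
    q, r = division_by_least_element(7, b)
    print(f"{b} = {q} * 7 + {r}")
```

The loop is deliberately naive: it mirrors the proof step by step rather than aiming at efficiency.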

Uniqueness: We show that there cannot exist two distinct expressions of
the desired form for b. More precisely, suppose

b = q₁a + r₁, 0 ≤ r₁ < a,

b = q₂a + r₂, 0 ≤ r₂ < a

are two expressions for b; we must show that r₁ = r₂, q₁ = q₂.

If r₁ = r₂, then q₁a = q₂a, and since a ≠ 0 we obtain, by cancellation,
q₁ = q₂, so the two expressions for b are the same. Suppose, therefore, that
r₁ ≠ r₂, say r₁ < r₂. From the two expressions we obtain

(q₁ − q₂)a = r₂ − r₁.

Since 0 ≤ r₁ < r₂ < a, it follows immediately that 0 < r₂ − r₁ < a, so

0 < (q₁ − q₂)a < a.

Because a > 0, 0 < (q₁ − q₂)a implies (q₁ − q₂) > 0, so (q₁ − q₂) ≥ 1 and
(q₁ − q₂)a ≥ a, which contradicts (q₁ − q₂)a < a. The assumption that r₁ ≠ r₂
is therefore untenable, and the proof of the division algorithm is finally
complete. ∎

It is worth noting that our uniqueness proof shows that if there are two 
expressions for b , then they are identical; no use is really made of existence. 
Thus, it would have been perfectly feasible to prove uniqueness first. 

There is still another way to visualize the existence part of the division 
algorithm. Suppose both a and b are positive, and consider any set of b 
objects. Start to rearrange these b objects in rows of a elements each. In the 
end, we have a certain number of complete rows, and there may be some 
objects left over. If we let q denote the number of complete rows and r denote 
the number of leftovers, then clearly b = qa + r and 0 ≤ r < a.

1-2-2. Remark. In the statement of the division algorithm we assumed
that a > 0. What happens if a < 0? Well, in this situation −a is positive, so
we can apply the division algorithm to b and −a. Consequently, there exist q
and r such that

b = q(−a) + r, 0 ≤ r < |a|.

But this may be rewritten as b = (−q)a + r, and, of course, −q is an integer.
The conclusion is that the division algorithm can also be stated in more
general form, as follows:

Given a, b ∈ Z with a ≠ 0, then there exist unique integers q and r such
that

b = qa + r, 0 ≤ r < |a|.

In the future, when referring to the division algorithm, it is to be understood
that we have this general version in mind.

To illustrate the division algorithm when a < 0, let us fix a = −37 and take
in turn b = 1, 0, −1, 111, 63, −63. The procedure mentioned above leads to
the expressions

1 = 0 · (−37) + 1,
0 = 0 · (−37) + 0,
−1 = 1 · (−37) + 36,
111 = (−3) · (−37) + 0,
63 = (−1) · (−37) + 26,
−63 = (2) · (−37) + 11.
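
The passage from a > 0 to arbitrary a ≠ 0 can be sketched in Python. The function name divide_general is hypothetical. The sketch relies on the built-in divmod, which for a positive divisor returns a remainder in the range 0 ≤ r < divisor; for a negative divisor Python's own remainder is not in the range required here, which is why the sketch divides by |a| and then flips the sign of the quotient, exactly as in the remark.

```python
def divide_general(a, b):
    """Return (q, r) with b = q*a + r and 0 <= r < |a|, for any a != 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    q, r = divmod(b, abs(a))  # b = q*|a| + r with 0 <= r < |a|
    if a < 0:
        q = -q                # q*(-a) = (-q)*a, so negate the quotient
    return q, r


for b in (1, 0, -1, 111, 63, -63):
    q, r = divide_general(-37, b)
    print(f"{b} = ({q}) * (-37) + {r}")
```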

1-2-3. Remark. In connection with the division algorithm, it is customary
to call the integers q and r, which are determined uniquely by b and a, the
quotient and remainder, respectively. The remainder r is intimately related to
the question of divisibility; in fact, it is not hard to see that

a | b ⇔ r = 0

and also, the logically equivalent statement,

a ∤ b ⇔ r ≠ 0.

These assertions are so obvious that it is worthwhile to prove them in detail.
If r = 0, then b = qa + r = qa + 0 = qa, so that a | b. Conversely, if a | b, then
there exists c ∈ Z such that b = ac; thus, b = c · a + 0, and by uniqueness of
the division algorithm, r = 0. (Incidentally, this also shows that if a | b, then
the integer c ∈ Z such that b = ac is unique!)

The division algorithm says nothing about actually finding q and r in
concrete cases. In this sense, it is not really an algorithm, a term usually
reserved for a straightforward, mechanical, often repetitive procedure for
carrying out a computation. However, we already “know” how to compute
q and r; this is what the process of long division, which we all learned long
ago, is all about. Strictly speaking, it needs to be proved that long division
really results in q and r, but this would take us too far afield, so we take it on
faith and leave the reader some food for thought.

1-2-4. Example. Suppose we fix a = 2. Then according to the division
algorithm, each b ∈ Z can be expressed uniquely in exactly one of the forms
2q or 2q + 1. If b = 2q (that is, if 2 | b) we say b is even; if b = 2q + 1 (that is,
if 2 ∤ b) we say that b is odd. It is now easy to verify the simple rules

even + even = even, even · even = even,
even + odd = odd, even · odd = even,
odd + odd = even, odd · odd = odd.
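
These rules can be spot-checked numerically. The Python sketch below is only an illustration over a small range of integers, not a proof; the helper name parity and the range −6, ..., 6 are assumptions. It records which parities occur for b + c and b · c for each combination of parities of b and c.

```python
def parity(b):
    """Classify b by its remainder on division by 2."""
    return "even" if b % 2 == 0 else "odd"


sums, products = {}, {}
for b in range(-6, 7):
    for c in range(-6, 7):
        key = (parity(b), parity(c))
        sums.setdefault(key, set()).add(parity(b + c))
        products.setdefault(key, set()).add(parity(b * c))

# Each set should contain exactly one parity if the rules hold on this range.
for key in sorted(sums):
    print(key, "sum:", sums[key], "product:", products[key])
```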

Suppose that a, b ∈ Z with a ≠ 0 are given. Applying the division algorithm
to b and a determines q and r such that b = qa + r. Now, assuming that
r ≠ 0, the division algorithm may be applied to a and r. Of course, this process
may then be repeated. Because several equations arise in this way, it is convenient
to organize the notation and index the q’s and r’s. Thus, instead of
b = qa + r we shall write b = q₁a + r₁; the next equation will be written as
a = q₂r₁ + r₂, and so on. All this leads to the following result.

1-2-5. Euclidean Algorithm. Given a, b ∈ Z with a ≠ 0, we have equations
of form

b = q₁a + r₁, 0 < r₁ < |a|,
a = q₂r₁ + r₂, 0 < r₂ < r₁,
r₁ = q₃r₂ + r₃, 0 < r₃ < r₂,
...
(ith equation) rᵢ₋₂ = qᵢrᵢ₋₁ + rᵢ, 0 < rᵢ < rᵢ₋₁,
...
rₙ₋₂ = qₙrₙ₋₁ + rₙ, 0 < rₙ < rₙ₋₁,
rₙ₋₁ = qₙ₊₁rₙ + 0.

Proof: There really is nothing to prove. The Euclidean algorithm asserts
that our process of division stops after a finite number of steps; in other
words, eventually we get a remainder of 0. But this is clear since r₁ > r₂ >
r₃ > ··· with all terms greater than or equal to 0, and any strictly decreasing
sequence of nonnegative integers must reach 0. ∎

The notation is chosen so that n is the index for which rₙ is the last nonzero
remainder, and then rₙ | rₙ₋₁. Of course, if a | b, then r₁ = 0, and there is
not much point in discussing the Euclidean algorithm.

1-2-6. Example. Consider a = 5530, b = 145,299. For these integers the
Euclidean algorithm is

145,299 = 26 · 5530 + 1519,
5530 = 3 · 1519 + 973,
1519 = 1 · 973 + 546,
973 = 1 · 546 + 427,
546 = 1 · 427 + 119,
427 = 3 · 119 + 70,
119 = 1 · 70 + 49,
70 = 1 · 49 + 21,
49 = 2 · 21 + 7,
21 = 3 · 7.

The reader who is interested in the bookkeeping aspects will note that
q₁ = 26, r₁ = 1519, q₂ = 3, r₂ = 973, q₃ = 1, r₃ = 546, q₄ = 1, r₄ = 427, q₅ = 1,
r₅ = 119, q₆ = 3, r₆ = 70, q₇ = 1, r₇ = 49, q₈ = 1, r₈ = 21, q₉ = 2, r₉ = 7,
q₁₀ = 3.

It is of some interest to observe that 7, the last nonzero remainder, divides
both a = 5530 and b = 145,299. Can you find any additional properties of
r₉ = 7?
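
The chain of equations in this example can be reproduced mechanically. The following Python sketch is an illustration under stated assumptions: the function name euclidean_steps and the tuple layout (dividend, quotient, divisor, remainder) are hypothetical, and the built-in divmod stands in for the division algorithm, applied to |divisor| so that every remainder lies in the range 0 ≤ r < |divisor|.

```python
def euclidean_steps(a, b):
    """Return the equations (dividend, quotient, divisor, remainder) obtained by
    repeated division, stopping at the first remainder equal to 0."""
    if a == 0:
        raise ValueError("a must be nonzero")
    steps = []
    dividend, divisor = b, a
    while divisor != 0:
        q, r = divmod(dividend, abs(divisor))  # 0 <= r < |divisor|
        if divisor < 0:
            q = -q                             # general form of the division algorithm
        steps.append((dividend, q, divisor, r))
        dividend, divisor = divisor, r         # next equation divides by the remainder
    return steps


steps = euclidean_steps(5530, 145299)
for dividend, q, divisor, r in steps:
    print(f"{dividend} = {q} * {divisor} + {r}")
print("last nonzero remainder:", steps[-2][3])  # 7
```

Reading the last nonzero remainder from steps[-2] presumes at least two equations, that is, that a does not divide b.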

1-2-7 / PROBLEMS 

1. Describe completely the set S = {b − xa ≥ 0 | x ∈ Z} when:

(i) a = 7, b = 12, (ii) a = −7, b = 12,
(iii) a = 7, b = −12, (iv) a = 9, b = 5,
(v) a = −9, b = −5.

2. Carry out the division algorithm and the Euclidean algorithm in each
of the following cases:

(i) a = 63,020, b = 76,084, (ii) a = 76,084, b = 63,020,
(iii) a = −63,020, b = 76,084, (iv) a = −76,084, b = 63,020,
(v) a = −63,020, b = −76,084, (vi) a = 113, b = −10,961,
(vii) a = 5311, b = 7571.

3. In 1-2-2, the general form of the division algorithm was stated as follows:
Given a, b ∈ Z with a ≠ 0, then there exist unique integers q and r such
that b = qa + r and 0 ≤ r < |a|. At that time, we showed that q and r
exist. Prove that q and r are indeed unique.

4. Show that the sum of any positive integer and its square is even. What if
the integer in question is not positive?

5. Prove that the product of two integers of form 4n + 1 is also of form
4n + 1; the product of two integers of form 4n + 3 is of form 4n + 1; the
product of an integer of form 4n + 1 and one of form 4n + 3 is of form
4n + 3.

6. Why must the last digit in the square of an integer be one of 0, 1, 4, 5, 6, 9?

7. Prove: If both a and b are odd then a² + b² is even but not divisible by
4. Moreover, a² + b² is not a perfect square; that is, there is no c ∈ Z for
which a² + b² = c².

8. Show that there is no integer n for which 4 divides n² + 2.

9. Prove that if 5 ∤ n, then n⁴ is of form 5m + 1.

10. Show that an integer of form 6n + 5 is also of form 3n − 1, but the converse
is false.

11. Can 100 be expressed as the sum of two integers one of which is divisible
by 7 and the other by 11? How about 99? 101?

12. In the formal proof of the Division Algorithm 1-2-1, why does one begin
by considering the set

S = {b − xa ≥ 0 | x ∈ Z}?


1-3. The Greatest Common Divisor 

We turn next to a topic whose results can be arranged in such a way that 
they may be viewed as consequences of the Euclidean algorithm. 

1-3-1. Definition. Suppose a ≠ 0 and b ≠ 0 are given. Any integer d which
satisfies the following condition

(i) d | a and d | b

is said to be a common divisor of a and b. If, in addition, d satisfies the following
two conditions

(ii) If c | a and c | b, then c | d,

(iii) d > 0,

then d is said to be a greatest common divisor (we abbreviate this to gcd) of
a and b.

Expressed in words: A greatest common divisor of a and b is a common 
divisor which is positive and is also divisible by every common divisor. 

As an example, we note that a = 210 and b = 280 have among their common
divisors ±1, ±2, ±7, ±10, ±14, ±35 (are there any others?). Of
course, 1 and -1 are common divisors for any choice of a and b. The situation
regarding gcd is more complicated at this stage. It is not hard for the reader to
convince himself that 6 is a greatest common divisor of a = 18 and b = 24
(after all, the only common divisors are ±1, ±2, ±3, ±6), nor is it much
more difficult to see that 70 is a greatest common divisor of a = 210 and
b = 280. On the other hand, for a = 5530, b = 145,299 we know from 1-2-6 that
7 is a common divisor, but the problem of locating a gcd looks messy. Even
more, are we really certain that these integers have a gcd?

As a matter of fact, for general a and b , a positive number is defined to be 
a greatest common divisor solely by virtue of its divisibility properties; its 
“ size ” does not enter into consideration, so from this point of view the use 
of the word “ greatest ” is probably deceptive and surely awkward. Thus, the 
definition of gcd raises several questions immediately, namely:

(1) Does a gcd exist?

(2) If a gcd exists, how many are there?

(3) If a gcd exists, can we find one?

These questions are answered by the following result. 


1-3-2. Theorem. Given a ≠ 0, b ≠ 0, their greatest common divisor
exists and is unique. Moreover, we can find it explicitly via the Euclidean 
algorithm. 


Proof: Let us treat uniqueness first. Suppose both d and d' are greatest
common divisors of a and b. Since d' is a common divisor, condition (ii)
guarantees that it divides the gcd d; thus, d' | d. By reversing the roles of d
and d', we have d | d'. Therefore, according to 1-1-2, part (ix), d = ±d'. But
they are both positive, so d = d'. We have shown that a and b can have at
most one greatest common divisor.

Next, let us show that a greatest common divisor does exist. Consider the
Euclidean Algorithm for the integers a and b, with the notation as in 1-2-5.
Thus, r_n is the last nonzero remainder. We assert that r_n is a gcd! To prove
this, it is necessary to verify the three conditions in the definition of gcd.

To show that r_n is a common divisor, note first that according to the last
equation (that is, the one for r_{n-1}), r_n | r_{n-1}. From the equation immediately
preceding this one (that is, the equation for r_{n-2}), it follows that r_n | r_{n-2}. In
this way, by moving upward through the equations of the Euclidean algorithm,
one by one, it follows that r_n divides each of the preceding r_i, and a and b, too.
(The reader may find it instructive here to carry out this discussion for the
concrete example, in which r_n = 7, treated in 1-2-6.) Thus, r_n is indeed a
common divisor of a and b.

Since r_n > 0, in order to complete the proof that r_n is a gcd, it remains to
show that if c is a common divisor of a and b, then c | r_n. For this, consider the
first equation of the Euclidean Algorithm. Since c | a and c | b, we see that
c | r_1. Turning to the second equation, we know that c | a and c | r_1, so it
follows that c | r_2. In this way, by moving downwards through the Euclidean
algorithm, one equation at a time, we conclude at the end that c | r_n. (The
reader may wish to carry out this argument in the concrete example 1-2-6 and
verify that if c divides both a = 5530 and b = 145,299, then c | 7.) Consequently,
r_n is a gcd; in fact, it is the only one. The entire proof is now complete. ∎

In virtue of this theorem, we now have the right to speak of the greatest
common divisor of a and b; we will denote it by (a, b). Since the definition of
gcd makes no distinction between a and b (that is, the roles of a and b are
interchangeable), it is clear that (a, b) = (b, a). Furthermore, the sign of a
or of b does not matter, for it is clear that

(a, b) = (-a, b) = (a, -b) = (-a, -b).

Consequently, in dealing with questions about the gcd, there is no loss of
generality in assuming that a > 0 and b > 0.

1-3-3. Example. Suppose a = 258, b = 354, and let us compute (258, 354).
The Euclidean algorithm says

354 = 1 · 258 + 96,

258 = 2 · 96 + 66,

96 = 1 · 66 + 30,

66 = 2 · 30 + 6,

30 = 5 · 6.

Therefore, (258, 354) = 6. Furthermore, as observed above, we now know
that (-258, 354) = (258, -354) = (-258, -354) = 6. Of course, these
assertions could also be proved directly; for example, applying the Euclidean
algorithm for a = -258, b = -354 gives

-354 = (2)(-258) + 162,

-258 = (-2)(162) + 66,

162 = 2 · 66 + 30,

66 = 2 · 30 + 6,

30 = 5 · 6,

so, once again, (-258, -354) = 6.
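The four sign patterns in this example can be checked in one stroke. The sketch below is only an illustration under our own naming (gcd_euclid is not from the text); it assumes a ≠ 0, keeps every remainder in the range 0 ≤ r < |divisor|, and returns the absolute value of the last divisor, which also covers the case in which the very first remainder is 0.

```python
from math import gcd  # library routine, used here only as an independent check


def gcd_euclid(a, b):
    """gcd of a and b (a != 0 assumed) via the Euclidean algorithm."""
    dividend, divisor = b, a
    while divisor != 0:
        # Replace (dividend, divisor) by (divisor, remainder).
        dividend, divisor = divisor, dividend % abs(divisor)
    # The divisor of the last equation (the one with remainder 0) is now in `dividend`.
    return abs(dividend)


for a, b in [(258, 354), (-258, 354), (258, -354), (-258, -354)]:
    assert gcd_euclid(a, b) == gcd(a, b) == 6
    print((a, b), "->", gcd_euclid(a, b))
```

Every line printed ends in 6, in agreement with the computations above.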

1-3-4. Remark. Suppose we compute the Euclidean algorithm for given
a and b; then, as has already been proved, the last nonzero remainder r_n is
precisely the gcd (a, b). If the first equation is now discarded, the remaining
equations obviously constitute the Euclidean algorithm for r_1 and a. Hence,
(r_1, a) = r_n. Thus, by discarding the equations (from the top) one at a time,
it follows that


r_n = (a, b) = (r_1, a) = (r_2, r_1) = ··· = (r_n, r_{n-1}).

1-3-5. Remark. For given a and b, their gcd (a, b) (which is often denoted
by d) is the biggest positive integer which divides them both. To see this, one
observes that if c is any positive common divisor of a and b, then c | d, and since
d > 0 it follows [by 1-1-2, part (vii)] that c ≤ d. This property of the gcd,
namely, biggest positive common divisor, may be used as the definition of
gcd; in fact, it is probably the more natural definition. However, in view of
generalizations and analogies to be discussed later, the definition we have
given is really the "proper" one.


1-3-6. Theorem. Suppose a ≠ 0, b ≠ 0 are given and let d = (a, b); then:

(i) Every linear combination xa + yb of a and b is divisible by d.

(ii) The gcd d can be expressed as a linear combination of a and b.

(iii) The smallest positive integer of form xa + yb is the gcd d.

(iv) The set of all linear combinations of a and b is identical with the set
of all multiples of d; in symbols,

{xa + yb | x, y ∈ Z} = {nd | n ∈ Z}.


Proof: We proved (i) long ago, in 1-2-1, part (v). To prove (ii) we rewrite
the equations of the Euclidean algorithm, as given in 1-2-5, by solving for the
remainder in each case. We have then

r_1 = b - q_1 a,

r_2 = a - q_2 r_1,

r_3 = r_1 - q_3 r_2,

...

r_i = r_{i-2} - q_i r_{i-1},

...

r_{n-1} = r_{n-3} - q_{n-1} r_{n-2},

r_n = r_{n-2} - q_n r_{n-1}.

Of course, the first equation implies that r_1 is a linear combination of a and b.
Substituting the expression for r_1 in the second equation gives

r_2 = (1 + q_2 q_1)a + (-q_2)b,

so r_2 is a linear combination of a and b. Now, when the expressions for r_1 and
r_2 are substituted in the third equation, it follows that r_3 is a linear combination
of a and b. In this way, by working downward through our system of
equations, we arrive finally at an explicit expression for d = r_n as a linear
combination of a and b.

One can also prove (ii) and find values of x and y for which d = xa + yb
by working upward through our system of equations arising from the Euclidean
algorithm. More precisely, starting from the last equation, we substitute
in it the expression for r_{n-1} given by the next to last equation; thus,

r_n = r_{n-2} - q_n r_{n-1}

= r_{n-2} - q_n (r_{n-3} - q_{n-1} r_{n-2})

= (1 + q_n q_{n-1}) r_{n-2} - q_n r_{n-3}.

Now, substitute r_{n-2} = r_{n-4} - q_{n-2} r_{n-3} in this expression. The process may
be repeated until, at the end, d = r_n is expressed explicitly as a linear combination
of a and b.

To prove (iii), let x_0 and y_0 be integers for which x_0 a + y_0 b is the smallest
positive integer of form xa + yb. Let us write d_0 = x_0 a + y_0 b; we must show
that d_0 = d. Since the positive integer d is, according to (ii), a linear combination
of a and b, it follows from the definition of d_0 that d_0 ≤ d. On the other
hand, according to (i), d | d_0; since d_0 > 0, this gives d ≤ d_0. We conclude
that d_0 = d.

To prove (iv), note first that, according to (i), every linear combination of
a and b is a multiple of d = (a, b). Symbolically, we may write

{xa + yb | x, y ∈ Z} ⊆ {nd | n ∈ Z}.

On the other hand, since d = x_0 a + y_0 b (with x_0, y_0 as above), any multiple
of d is of form

nd = n(x_0 a + y_0 b) = (n x_0)a + (n y_0)b,

so it is a linear combination of a and b. This shows that

{nd | n ∈ Z} ⊆ {xa + yb | x, y ∈ Z}.

The proof is now complete. ∎
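Parts (i), (iii), and (iv) of the theorem lend themselves to a numerical spot check. The sketch below is an illustration, not a proof: it looks only at coefficients x, y in a finite window (the bound 60 is an arbitrary choice of ours), takes a = 258 and b = 354 as sample values, and uses the library routine math.gcd to supply d.

```python
from math import gcd

a, b = 258, 354
d = gcd(a, b)

# All linear combinations xa + yb with -60 <= x, y <= 60.
combos = {x * a + y * b for x in range(-60, 61) for y in range(-60, 61)}

# Part (i): every linear combination is divisible by d.
assert all(c % d == 0 for c in combos)

# Part (iii): the smallest positive linear combination in the window is d.
smallest_positive = min(c for c in combos if c > 0)
assert smallest_positive == d

# Part (iv), partially: the multiples n*d with |n| <= 5 all occur as combinations.
assert {n * d for n in range(-5, 6)} <= combos

print(d, smallest_positive)
```

A finite window can never exhibit the full set equality in (iv); it merely shows the two inclusions at work on a small scale.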

1-3-7. Example. As in 1-3-3, consider a = 258, b = 354. Let us express
(258, 354) = 6 as a linear combination of 258 and 354. The procedure is, of
course, the one used in the proof of 1-3-6, part (ii). We rewrite the Euclidean
algorithm, which was computed in 1-3-3, in the form

96 = 354 - 1 · 258,

66 = 258 - 2 · 96,

30 = 96 - 1 · 66,

6 = 66 - 2 · 30.

Working downward from the top, we have

96 = b - a,

66 = a - 2(b - a) = 3a - 2b,

30 = (b - a) - (3a - 2b) = 3b - 4a,

6 = (3a - 2b) - 2(3b - 4a) = 11a - 8b.

Consequently,

6 = (11)(258) + (-8)(354).

On the other hand, we can also obtain an expression for 6 as a linear 
combination of 258 and 354 by working upward from the bottom, as follows. 

6 = 66- (2)(30) 

= 66 - 2(96 - 66) 

= 3(66) - 2(96) 

= 3(258 - 2(96)) - 2(96) 

= 3(258) - 8(96) 

= 3(258) - 8(354 - 258) 

= 11(258)- 8(354). 

It may also be noted that to express the gcd of a = 258, b = — 354 as a 
linear combination of a and b , we simply take note of the work above, and 
write 


(258, -354) = 6 = (11)(258) + (8)(-354). 

In similar fashion, we have 

(-258, 354) = (-11)(-258) + (-8)(354),

(-258, -354) = (-11)(-258) + (8)(-354).
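The "working downward" procedure of this example amounts to carrying, alongside each remainder, the coefficients that express it in terms of a and b. A minimal Python sketch follows; the name extended_gcd and the bookkeeping variables are our own, and a ≠ 0 is assumed. For other sign patterns the coefficients it returns need not coincide with the ones displayed above, since such coefficients are not unique.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = (a, b) > 0 and d = x*a + y*b (a != 0 assumed)."""
    # Invariants: r0 = x0*a + y0*b and r1 = x1*a + y1*b.
    r0, x0, y0 = b, 0, 1
    r1, x1, y1 = a, 1, 0
    while r1 != 0:
        r = r0 % abs(r1)       # remainder with 0 <= r < |r1|
        q = (r0 - r) // r1     # the corresponding quotient
        # r = r0 - q*r1, so its coefficients are (x0 - q*x1, y0 - q*y1).
        r0, x0, y0, r1, x1, y1 = r1, x1, y1, r, x0 - q * x1, y0 - q * y1
    if r0 < 0:                 # happens only when a is negative and a divides b
        r0, x0, y0 = -r0, -x0, -y0
    return r0, x0, y0


d, x, y = extended_gcd(258, 354)
print(d, x, y)                 # 6 11 -8
assert d == x * 258 + y * 354
```

For a = 258, b = 354 the intermediate coefficient pairs are exactly those of the example: (-1, 1) for 96, (3, -2) for 66, (-4, 3) for 30, and (11, -8) for 6.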

1-3-8. Proposition. The greatest common divisor satisfies the following
properties:

(i) If m > 0, then (ma, mb) = m(a, b).

(ii) If (a, b) = d and we write a = da', b = db', then (a', b') = 1.

Proof: (i) In virtue of 1-3-6, part (iii) and the fact that m is positive,
it is clear that (ma, mb) is equal to the smallest positive integer of form
x(ma) + y(mb), which is equal to m times the smallest positive integer of form
xa + yb, which in turn is equal to m(a, b).

Furthermore, it is clear that if the condition m > 0 is dropped, this result
takes the form

(ma, mb) = |m| (a, b).

(ii) This follows from (i), since

d = (a, b) = (da', db') = d(a', b')

implies (a', b') = 1. ∎

As illustrations of these facts, we note that (258, 354) = (2 · 129, 2 · 177) =
2(129, 177), (258, 354) = (3 · 86, 3 · 118) = 3(86, 118), and (258, 354) =
(6 · 43, 6 · 59) = 6(43, 59); moreover, because it is already known that
(258, 354) = 6, it follows that (43, 59) = 1.
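These illustrations are easy to confirm by machine. The following lines are a small check, not part of the argument; they rely on the library routine math.gcd, and the sample multipliers (including a negative one, for the |m| form) are our own choice.

```python
from math import gcd

a, b = 258, 354
d = gcd(a, b)

# (ma, mb) = |m| (a, b) for a few nonzero multipliers m.
for m in (2, 3, 6, -5):
    assert gcd(m * a, m * b) == abs(m) * d

# Writing a = d a', b = d b' leaves relatively prime a', b'.
a_prime, b_prime = a // d, b // d
assert gcd(a_prime, b_prime) == 1

print(d, a_prime, b_prime)   # 6 43 59
```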

1-3-9. Definition. The integers a ≠ 0, b ≠ 0 are said to be relatively prime
when (a, b) = 1.

We conclude this section with several elementary properties of relative 
primeness, the second of which will play a crucial role in the next section. 

1-3-10. Proposition. For integers a, b, c we have:

(i) a and b are relatively prime <=> there exist integers x_0, y_0 such that
x_0 a + y_0 b = 1.

(ii) If a | bc and (a, b) = 1, then a | c.

(iii) If a | c, b | c, and (a, b) = 1, then ab | c.

(iv) If (a, b) = 1 and (a, c) = 1, then (a, bc) = 1.

Proof: (i) This is a trivial consequence of 1-3-6. In fact, if (a, b) = 1, then
surely there exist x_0, y_0 such that x_0 a + y_0 b = 1. Conversely, if x_0 a + y_0 b =
1, then (a, b) = d must divide 1, so (a, b) = d = 1.

(ii) According to (i), there exist x_0, y_0 such that x_0 a + y_0 b = 1; and then
x_0 ac + y_0 bc = c. Since a | bc, a divides both terms on the left, which implies
that a | c.

(iii) Since a | c, we may write c = aa' for some a' ∈ Z. But then b | aa' and
(b, a) = 1, so in virtue of (ii) we conclude that b | a'. Consequently, ab | aa' = c.

(iv) The hypotheses imply that there exist x_0, y_0, x_1, y_1 such that
x_0 a + y_0 b = 1 and x_1 a + y_1 c = 1. Therefore,

1 = x_0 a + (y_0 b) · 1

= x_0 a + (y_0 b)(x_1 a + y_1 c)

= (x_0 + y_0 b x_1)a + (y_0 y_1)bc,

which guarantees that (a, bc) = 1. ∎

1-3-11. Exercise. There is a flaw in our discussion of gcd. It was defined
for a ≠ 0, b ≠ 0 and its existence was proved by making use of the Euclidean
algorithm. What happens if a | b? In this situation, b = qa + 0, so there is no
nonzero remainder and the existence proof for gcd, which was given in 1-3-2,
breaks down. The way to avoid this difficulty is to include in the proof of
1-3-2 a direct proof that if a | b, then |a| satisfies the requirements for a gcd
of a and b. (The proof is left to the reader.) Thus, if a | b then (a, b) = |a|; and
in this case too we could say that the Euclidean algorithm determines the gcd.
More precisely, in all cases, the gcd is the absolute value of the divisor in the
last equation (meaning the one with remainder 0) of the Euclidean algorithm.

Furthermore, if we look carefully at the various results about gcd, especially
1-3-6 and 1-3-8, it becomes apparent that their proofs do not take care of
the case where a | b. We leave it to the reader to provide the missing proofs.

The upshot is that all the results about gcd are valid, as stated, for arbitrary
a ≠ 0, b ≠ 0.

1-3-12. Exercise. What happens if instead of defining gcd only when both
integers are not equal to 0, we permit one of them to be 0, say a ≠ 0, b = 0?
In this situation (which is related to the special case a | b as treated in 1-3-11) it
is clear that (a, 0) = |a|. It is easy to show that the results about gcd remain
valid in this case too.

1-3-13 / PROBLEMS 

1. For a ≠ 0, b ≠ 0, prove carefully that (a, b) = (-a, b) = (a, -b) =
(-a, -b).

2. In 1-2-6 we saw that 7 = (5530, 145,299). Use the Euclidean algorithm,
as given there, to derive the following expression for 7 as a linear combination
of a = 5530 and b = 145,299:

7 = (-6122)a + 233b.

3. In each of the following cases find (a, b) and express it as a linear combination
of a and b:

(i) a = 91, b = 143,  (ii) a = 143, b = 91,

(iii) a = -143, b = -91,  (iv) a = 5311, b = 7571,

(v) a = 5311, b = -7571.

4. Prove in detail that (a, b) = 1 <=> there exist integers x_1, y_1 such that
x_1 a + y_1 b = -1. (Of course, this result should be compared with 1-3-10,
part (i).)

5. Prove that the following properties of gcd hold: 

(i) If (a, b) = 1 and c | a, then (c, b) = 1.

(ii) If (a, bc) = 1, then (a, b) = (a, c) = 1.

(iii) If (a, b) = 1, then (ca, b) = (c, b).

6. Suppose (a, b) = d, and let x_0, y_0 be integers for which x_0 a + y_0 b = d.
Show that

(i) (x_0, y_0) = 1.

(ii) x_0 and y_0 are not unique.

7. If c > 0, c | a, c | b, show that (a/c, b/c) = (a, b)/c. Note that since we deal
only with integers, the reader must first clarify the meaning of a/c, b/c, and
(a, b)/c.

8. Show that for every integer a ≠ 0, (a, a + 2) = 1 or 2. Moreover, for
every n and every a ≠ 0, (a, a + n) divides n.

9. Prove: If (a, 4) = 2 = (b, 4), then (a + b, 4) = 4.

10. Suppose (a, b) = 1; then

(i) (a + b, b) = 1; in fact, (a + mb, b) = 1 for every m ∈ Z.

(ii) (a, b^2) = 1; in fact, (a^s, b^t) = 1 for all positive integers s and t.

11. Suppose a, b, c, d, m, n, u, and v are nonzero integers with ad - bc = ±1,
u = am + bn, v = cm + dn; show that (m, n) = (u, v).

12. If (a, b) = 1, show that

(i) (a + b, a - b) = 1 or 2,

(ii) (a^2 + b^2, a + b) = 1 or 2,

(iii) (a^3 + b^3, a^2 + b^2) divides a - b,

(iv) (a + b, a^2 - ab + b^2) = 1 or 3.

13. Consider the set S = {1, 2, ..., n}, and let 2^r be the highest power of 2
which belongs to S. Show that 2^r does not divide any element of S other
than itself.

14. Show that (a, b + ma) = (a, b) for all m ∈ Z.

15. In the proof of part (iii) of 1-3-6, we started by taking integers x_0, y_0 for
which d_0 = x_0 a + y_0 b is the smallest positive integer of form xa + yb.
How do we know that such integers x_0, y_0, d_0 exist?

16. State the various parts of 1-3-10 “in words.” 

1-4. Unique Factorization 

Given a nonzero integer, we may try to “break it up” with respect to 
multiplication—that is, to factor it—into “ simpler ” integers. From this point 
of view, the simplest integers are those which cannot be factored further; this 
is what underlies the following definition. 

1-4-1. Definition. An integer p > 1 is said to be prime when 1 and p are 
its only positive divisors; if p > 1 is not prime, we say it is composite. 

Thus, to say that n > 1 is composite means that we can write n = ab with
1 < a < n and 1 < b < n, while saying that n > 1 is prime means that n cannot 
be written in this form. 

For example, 91, 5311, 7571 are composite because 91 = 7 · 13, 5311 =
47 · 113, 7571 = 67 · 113. On the other hand, it is easy to see that 2, 3, 7, 13, 19,
37, 67, 79, 97, 113, 199, 223, 229 are primes. Of course, according to the definition,
1 is not a prime, nor is it composite. The reason for leaving 1 in limbo
is that it clearly plays no role in questions of factorization (even though 1 is a
divisor or factor of any integer); 1 can be included or removed from any
factorization, so it should appropriately be ignored.
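Definition 1-4-1 translates directly into a test that tries every candidate divisor. The sketch below is deliberately naive and follows the definition word for word; the names is_prime and nontrivial_factorization are our own, and no attempt at efficiency is made.

```python
def is_prime(p):
    """True when p > 1 and the only positive divisors of p are 1 and p."""
    if p <= 1:
        return False               # 1 (and anything smaller) is not prime
    return all(p % k != 0 for k in range(2, p))


def nontrivial_factorization(n):
    """For composite n > 1 return (a, b) with n = a*b, 1 < a < n, 1 < b < n;
    return None when no such pair exists."""
    for a in range(2, n):
        if n % a == 0:
            return a, n // a
    return None


for n in (91, 5311, 7571):
    print(n, "=", nontrivial_factorization(n))   # (7, 13), (47, 113), (67, 113)

assert all(is_prime(p) for p in (2, 3, 7, 13, 19, 37, 67, 79, 97, 113, 223, 229))
assert not is_prime(1)
```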

What about the factorization of negative numbers? If n < -1, then
n = (-1)(-n) with -n > 1, so the question of factoring the negative number
n reduces to factoring the positive integer -n. In addition, +1 and -1 make
no meaningful contribution to a factorization, and they may be safely ignored.
Thus, we shall restrict ourselves, without loss of generality, to factorization
questions for integers greater than 1.

In this connection, it should be noted that if p is prime, then -p satisfies
the same divisibility properties as p, but it contributes no new information.
It would be awkward and confusing, at best, to keep track of both p and -p, or
to view them as distinct primes. It is because of this that our definition requires
that a prime be positive.

1-4-2. Proposition. If p is prime, then

(i) If p ∤ a, then (p, a) = 1.

(ii) If p | ab and p ∤ a, then p | b.

(iii) If p | a_1 a_2 a_3 ··· a_s, then p divides one or more of the a_i.

Proof: (i) The only positive divisors of p are 1 and p, so p ∤ a implies that
1 is the greatest common divisor of p and a. Thus, for example, since 7 ∤ 99
and 7 is prime, we know that 7 and 99 are relatively prime.

(ii) In virtue of (i) we have p | ab and (p, a) = 1, and then 1-3-10, part (ii)
implies p | b. This shows that if a prime divides the product of two integers
and does not divide the first factor, then it divides the second.

(iii) Proceed inductively. If p | a_1 we are finished; if not, then according to
(ii), p | a_2 a_3 ··· a_s. Now, if p | a_2 we are finished; if not, then p | a_3 ··· a_s. By
repeating the process, we arrive eventually at an a_i which is divisible by p, thus
showing that if a prime divides a product, then it must divide at least one of
the factors. ∎

1-4-3. Theorem. Every integer greater than 1 can be expressed as a finite
product of primes.

Proof: This result may appear obvious, but it does require proof. We give
a proof by contradiction.

Suppose the theorem is false; so there exists an integer, greater than 1,
that cannot be expressed as a finite product of primes (call it m). Let S denote
the set of all integers greater than 1 that cannot be written as a finite product
of primes. Then S ≠ ∅ (as m ∈ S). Since S is a nonempty collection of positive
integers, it contains a smallest element; call it n, so n is less than any other
element of S. Now, n > 1 (because n ∈ S and every element of S is greater than
1) and n is composite (for if n is prime, then it is already written as a finite
product of primes, with only one factor, and, therefore, n could not belong
to S). Therefore, we may write

n = ab, 1 < a < n, 1 < b < n.

But then, because n is the smallest element of S, we have a ∉ S, b ∉ S; so both
a and b can be expressed as a finite product of primes. Consequently, their
product n = ab can also be written as a finite product of primes, which
contradicts n ∈ S. Our original supposition that the theorem is false having
led to a contradiction, we conclude that the theorem is true. ∎
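The idea behind this proof (a composite n splits as n = ab with smaller factors, each of which is handled in turn) can be turned into a recursive procedure. The sketch below is our own illustration and not the method of the text; the name prime_factors is hypothetical, and the search for a divisor is the crudest possible one.

```python
def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n > 1."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    for a in range(2, n):
        if n % a == 0:
            # n is composite: n = a * b with 1 < a < n and 1 < b < n,
            # and both factors are smaller, so the recursion terminates.
            return prime_factors(a) + prime_factors(n // a)
    return [n]   # no divisor strictly between 1 and n: n is prime


for n in (91, 5311, 7571):
    print(n, prime_factors(n))   # [7, 13], [47, 113], [67, 113]
```

The procedure produces one factorization; it says nothing about whether a different splitting could lead to a different list of primes.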

We digress from the main theme of this section to deal with two related 
results whose proofs make use of 1-4-3. 

1-4-4. Theorem (Euclid, circa 300 B.C.). The number of primes is infinite.

Proof: Suppose the number of primes is finite, and denote all of them by

p_1 < p_2 < p_3 < ··· < p_n.

Thus, p_1 = 2, p_2 = 3, p_3 = 5, p_4 = 7, and so on. Now consider the integer

N = p_1 p_2 p_3 ··· p_n + 1. (*)

In words, N is obtained by adding 1 to the product of all the primes. Since
N is not a prime (because, according to our supposition, p_n is the biggest
prime, and N is bigger than p_n), it can therefore be expressed as a product of
primes. But there are no primes other than p_1, p_2, ..., p_n, so at least one p_i
appears in the prime factorization of N. In particular, this p_i divides N.
Because p_i divides the product p_1 p_2 ··· p_n, it follows from (*) that p_i | 1, a
contradiction. The initial assumption that the number of primes is finite having
led to a contradiction, we conclude that the number of primes is infinite. ∎
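The construction in (*) can be tried on any finite list of primes. In the sketch below the list 2, 3, 5, 7, 11, 13 is our own sample (it is, of course, not the list of all primes), and smallest_prime_factor is a hypothetical helper. The number N need not be prime, but it leaves remainder 1 on division by each listed prime, so each of its prime factors lies outside the list.

```python
from math import prod


def smallest_prime_factor(n):
    """Smallest divisor of n that exceeds 1 (necessarily prime), for n > 1."""
    for k in range(2, n + 1):
        if n % k == 0:
            return k


primes = [2, 3, 5, 7, 11, 13]
N = prod(primes) + 1                  # 30031
p = smallest_prime_factor(N)          # 59, since 30031 = 59 * 509

assert all(N % q == 1 for q in primes)   # no listed prime divides N
assert p not in primes                   # so N has a prime factor outside the list
print(N, p, N // p)
```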

Suppose we apply the division algorithm for an arbitrary integer b and
a = 4; then b is clearly of exactly one of the following forms:

I: 4n,

II: 4n + 1,

III: 4n + 2,

IV: 4n + 3.

Obviously, any integer of type I or III is even, while any integer of type II or
IV is odd. Conversely, any even integer must be of type I or III, while any odd
integer must be of type II or IV.

There are no primes of type I, and there is exactly one prime of type III,
namely, the prime 2. The remaining primes, which by 1-4-4 are infinite in
number, are odd, so each of them is of type II or IV. One may ask, how
many primes are there of form 4n + 3 (that is, of type IV)? In other words,
how many primes appear in the arithmetic progression

3, 7, 11, 15, 19, 23, 27, 31, ...

with initial term 3 and difference 4? As is expected perhaps, the answer is as
follows:


1-4-5. Theorem. The number of primes of form 4n + 3 is infinite.


Proof: Our proof is by contradiction and involves a slight refinement of
the argument used to prove 1-4-4.

Suppose there are only a finite number of primes of form 4n + 3. Denote
them by

p_1 < p_2 < p_3 < ··· < p_m.

In particular, p_1 = 3, p_2 = 7, p_3 = 11, p_4 = 19, p_5 = 23, ... and we are assuming
that p_m is the biggest prime of form 4n + 3. Consider the integer

M = 4 p_2 p_3 ··· p_m + 3,

which is surely of form 4n + 3. We observe that M is composite (because p_m
is the largest prime of form 4n + 3), p_1 ∤ M (because p_1 = 3 does not divide
4 p_2 p_3 ··· p_m), and p_i ∤ M for i = 2, ..., m (because any such p_i does not
divide 3). Therefore, the factorization of M cannot contain any primes of form
4n + 3. Furthermore, M is odd, so its factorization cannot include the prime
2. It follows that all the prime factors of M are of form 4n + 1. But the
product of two, or more, integers of form 4n + 1 is again of form 4n + 1,
which implies that M must be of form 4n + 1, a contradiction. Hence, the
number of primes of form 4n + 3 is infinite. ∎
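The integer M of this proof can likewise be formed from a finite list of primes of form 4n + 3. The sketch below uses the first five such primes as a sample of our own choosing; prime_factors is a hypothetical helper that divides out factors by trial division up to the square root of what remains. By the reasoning of the proof, M must have a prime factor of form 4n + 3, and no such factor can belong to the list.

```python
from math import prod


def prime_factors(n):
    """Prime factors of n > 1 in nondecreasing order, by trial division."""
    factors, k = [], 2
    while k * k <= n:
        while n % k == 0:
            factors.append(k)
            n //= k
        k += 1
    if n > 1:
        factors.append(n)   # whatever remains is prime
    return factors


known = [3, 7, 11, 19, 23]             # p_1, ..., p_5 of form 4n + 3
M = 4 * prod(known[1:]) + 3            # 4 * p_2 * ... * p_m + 3

new = [p for p in prime_factors(M) if p % 4 == 3]

assert M % 4 == 3                      # M is of form 4n + 3
assert new                             # it has a prime factor of form 4n + 3
assert all(p not in known for p in new)
print(M, prime_factors(M))
```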

It is also known that the number of primes of form 4n + 1 is infinite. The
reader may wish to undertake the proof, but he will probably fail because our
present techniques are insufficient to win through to the end. However, a
proof of this fact will be given in Chapter III.

There is a deep and famous theorem of Dirichlet (1805-1859) that settles
all questions of this sort. It says, roughly, that any arithmetic progression
contains an infinite number of primes. More precisely, if c ≠ 0, d ≠ 0, and
(c, d) = 1, then the sequence

c, c + d, c + 2d, c + 3d, ...,

contains an infinite number of primes. In our terminology this says that for
c ≠ 0, d ≠ 0, (c, d) = 1 there are an infinite number of primes of form c + md.

Let us return to the main theme. We have seen that any integer greater
than 1 can be written as a product of primes. It is easy, for example, to factor
7200; namely, 7200 = 2 · 2 · 2 · 3 · 3 · 2 · 2 · 5 · 5. On the other hand, trying
to factor 24,523 is considerably more difficult; it turns out that 24,523 =
137 · 179. In fact, the theorem on existence of a prime factorization says
nothing about how to find a factorization, and there really is not much we
can do about it. However, another interesting question arises. In how many
ways can an arbitrary integer greater than 1 be factored into primes? The
fact that we are "certain" that there is essentially only one factorization (after
all, this is what experience tells us) does not necessarily make it so. Of course,
in virtue of the commutative law for multiplication, the order in which the
prime factors appear does not matter; for example, we should clearly not
consider

2 · 2 · 2 · 3 · 3 · 2 · 2 · 5 · 5 and 2 · 3 · 2 · 3 · 2 · 5 · 2 · 5 · 2

as distinct factorizations. Except for this unavoidable limitation, we have the
best of all possible worlds, namely:

1-4-6. Theorem. The factorization of any integer greater than 1 into 
primes is unique up to order. In other words, the order in which the terms 
of the prime factorization are listed does not matter. 

Proof: Suppose the theorem is false; so the set S of all integers greater
than 1 which have more than one prime factorization is nonempty. Then S
has a smallest element; call it n. Thus, n has at least two prime factorizations,
and any integer greater than 1 and less than n has a unique prime factorization.
Consider any two distinct factorizations

n = p_1 p_2 ··· p_r = q_1 q_2 ··· q_s,

where every p_i and q_j is prime. (We do not assert that all the p's are distinct
or that all the q's are distinct; any of the primes may well appear several
times.)

We must have r ≥ 2 and s ≥ 2; in other words, each side of the equation
must have at least two factors. For if one side has only a single factor, we
have, for example,


n = p_1 = q_1 q_2 ··· q_s,





which provides a factorization of the prime p_1. This being impossible, we must
have s = 1 and p_1 = q_1, which says that the two factorizations are identical
and contradicts the basic property of n. So indeed r ≥ 2 and s ≥ 2.

Now consider the prime p_1. It divides the right side so, according to 1-4-2,
it divides at least one of the q's. Reindexing the q's if necessary (here is where
the question of order of the factors enters) we may assume that p_1 divides q_1.
But q_1 is prime, so it is immediate that p_1 = q_1. Consequently,

n = p_1 p_2 ··· p_r = p_1 q_2 ··· q_s,

and if we write n = p_1 n', then

n' = p_2 ··· p_r = q_2 ··· q_s.

Surely, n' > 1 (because r ≥ 2, which implies n ≠ p_1) and n' < n. Therefore, it
follows from the choice of n that the factorization of n' into primes is unique;
so with reordering, if necessary, we have

p_2 = q_2, p_3 = q_3, ..., p_r = q_r and r = s.

Because p_1 = q_1, we see that our two distinct factorizations of n are the same, a
contradiction. The proof of the theorem is now complete. ∎

1-4-7. Remark. Although existence and uniqueness of factorization were
proved for integers n > 1, these results clearly apply also for negative n; one
simply takes out a factor -1 at the start. Our general result, therefore, reads
as follows:

Any integer n ≠ 0 can be expressed uniquely in the form

n = (±1) p_1 p_2 ··· p_r,

where the p_i are primes and exactly one of +1, -1 occurs.

It should be noted that the proofs of existence of a prime factorization 
1-4-3 and of uniqueness of prime factorization are independent of each other; 
either theorem could be proved first. The existence proof makes use of none 
of the information we have collected about “number theory.” On the other 
hand, the uniqueness proof uses 1-4-2, which in turn depends on facts about 
the gcd. 

1-4-8 / PROBLEMS

1. Find all primes less than 300. [Hint: One way to do this mechanically is
by the so-called "sieve of Eratosthenes." First write all the integers from
2 to 299. Then starting from 2, cross out every second number, while
keeping 2. This eliminates all numbers divisible by 2. Then, 3 being the
smallest number (in fact, prime) remaining, keep it, and cross out every
third number starting from 3 (note that the numbers that have already
been crossed out are counted). Next, go on to 5 and cross out every fifth
integer. This process is to be continued as long as necessary.]

2. If the positive integer n is composite, show that it has a prime factor p
satisfying p ≤ √n. Thus, to decide if n is prime it suffices to check that it
is not divisible by any prime not exceeding √n. This also tells us how far to go
in the sieve of Eratosthenes.

3. Express every prime less than 225 which is of form 4n + 1 as a sum of
two squares.

4. Prove that any prime of form 3n + 1 is of form 6n + 1.

5. Show that any positive integer of form 3n + 2 has a prime factor of the
same form.

6. In the proof of 1-4-4 (that the number of primes is infinite) use is made
of the integer N = p_1 p_2 p_3 ··· p_n + 1. What happens if we take

N = p_n! + 1?

7. We proved 1-4-4 and 1-4-5 (the infinitude of primes and the infinitude of
primes of form 4n + 3) immediately after proving the existence of prime
factorization 1-4-3, but before proving uniqueness 1-4-6. Show that
uniqueness is not needed for the proof of 1-4-4 or 1-4-5.

8. Prove carefully the statement in 1-4-7; namely, any integer n ≠ 0 can be
expressed uniquely in the form n = (±1) p_1 p_2 ··· p_r where the p_i are
primes and exactly one of +1, -1 occurs. What happens when n = ±1?

9. Suppose that (a, p^2) = p, (b, p^4) = p^2 and p is prime. Evaluate:

(i) (ab, p^5),  (ii) (a + b, p^4),

(iii) (a - b, p^5),  (iv) (pa - b, p^5).

10. Suppose that (a, b) = p, p prime. Evaluate

(i) (a^2, b),  (ii) (a^2, b^2),

(iii) (a^3, b),  (iv) (a^2, b^3).

In each case, give numerical examples to illustrate your answer.

11. For any positive integer n, show that

(i) 2 | (n^2 - n),  (ii) 6 | (n^3 - n),

(iii) 30 | (n^5 - n).

12. Show that the product of three consecutive integers is divisible by 6.
Moreover, if the first one is even, the product is divisible by 24.





13. Show that the product of four consecutive integers is divisible by 24. 

14. Prove: If the sum of two fractions ajb and cjd in lowest terms (meaning 
that (<a , b) = 1, (c, d) = 1) is an integer, then b = ±d. 

15. Prove that the number of primes of form $6n + 5$ is infinite.

16. If d = (a, b) and b > 0, show that the finite sequence 

$$a, 2a, 3a, \ldots, ba$$

has exactly $d$ of its terms divisible by $b$.

17. For p prime, decide if the following statements are true or false; if true, 
prove the statement; if false, give a counterexample. 

(i) $p \mid (a^2 + b^2)$, $p \mid (b^2 + c^2) \Rightarrow p \mid (a^2 - c^2)$.

(ii) p | a 1 =>p | a, 

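(iii) $p \mid a$, $p \mid (a^2 + b^2) \Rightarrow p \mid b$.

(iv) $p \mid (a^2 + b^2)$, $p \mid (b^2 + c^2) \Rightarrow p \mid (a^2 + c^2)$.

(v) $a^3 \mid b^3 \Rightarrow a \mid b$.

(vi) $a^3 \mid b^2 \Rightarrow a \mid b$.

(vii) $a^2 \mid b^3 \Rightarrow a \mid b$.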

18. Consider the set 

$$2\mathbf{Z} = \{2n \mid n \in \mathbf{Z}\},$$

which is, of course, the set of all even integers. We can add, subtract, or
multiply two elements of $2\mathbf{Z}$, and the result is an element of $2\mathbf{Z}$. In
analogy with $\mathbf{Z}$, we can discuss questions of divisibility and factorization
in $2\mathbf{Z}$. (Note, for example, that in $2\mathbf{Z}$, 2 does not divide $6 = 2 \cdot 3$, but 2
divides $4 = 2 \cdot 2$.) We can also introduce the notion of a prime in $2\mathbf{Z}$:
namely, a positive element of $2\mathbf{Z}$ whose only positive divisor in $2\mathbf{Z}$ is
itself. (Note that $1 \notin 2\mathbf{Z}$, so we ignore it as a possible divisor of a prime.)

(i) List some primes of $2\mathbf{Z}$. Can you describe all the primes of $2\mathbf{Z}$?

(ii) Prove that every positive element of $2\mathbf{Z}$ can be expressed as a product
of primes of $2\mathbf{Z}$.

(iii) Show that this factorization into primes need not be unique.

(iv) What about negative primes and the factorization of negative elements
of $2\mathbf{Z}$?

19. Is 999,991 prime? 

20. Show that $n^4 + 4$ is composite for every $n > 1$.

1-5. A Convenient Notation 

Consider an integer a > 1 and its unique factorization into primes. A 
given prime p may appear more than once in this factorization and, as is 
perfectly natural, we combine such repeated terms. In this way, the factorization
of $a$ is expressed in terms of distinct primes raised to appropriate powers.
For example, if $a = 7200 = 2 \cdot 3 \cdot 2 \cdot 3 \cdot 2 \cdot 5 \cdot 2 \cdot 5 \cdot 2$ we would write

$$a = 7200 = 2^5 \cdot 3^2 \cdot 5^2.$$

Similarly, for $b = 9996 = 2 \cdot 3 \cdot 7 \cdot 2 \cdot 17 \cdot 7$ we would write

$$b = 9996 = 2^2 \cdot 3 \cdot 7^2 \cdot 17.$$

Of course, these factorizations of a = 7200 and b = 9996 in terms of prime 
powers are unique up to order. Now, we can compute the product ab , but of 
much greater interest here is the fact that we can immediately produce the 
factorization of ab in terms of prime powers—namely, 

$$ab = (7200)(9996) = 2^7 \cdot 3^3 \cdot 5^2 \cdot 7^2 \cdot 17.$$

How is this done? For a prime which appears in both factorizations we add its
two exponents; for a prime which appears in just one factorization, we view
it as appearing to the power 0 in the other factorization and again add exponents.
Naturally, primes which appear in neither factorization are ignored. In
other words, we “really” write

$$a = 7200 = 2^5 \cdot 3^2 \cdot 5^2 \cdot 7^0 \cdot 17^0, \qquad b = 9996 = 2^2 \cdot 3^1 \cdot 5^0 \cdot 7^2 \cdot 17^1,$$

and add the exponents for each of the primes.
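The bookkeeping just described (pad with zero exponents, then add) can be sketched in a few lines of Python. This is only an illustration: the names `a_exps`, `b_exps`, and `value` are ours, and `collections.Counter` is chosen because adding two counters adds the counts key by key, treating a missing prime as exponent 0.

```python
from collections import Counter
from math import prod

# Prime -> exponent tables taken from the factorizations in the text.
a_exps = Counter({2: 5, 3: 2, 5: 2})          # 7200
b_exps = Counter({2: 2, 3: 1, 7: 2, 17: 1})   # 9996

def value(exps):
    """Multiply out a prime -> exponent table to recover the integer."""
    return prod(p ** e for p, e in exps.items())

# Counter addition adds exponents prime by prime.
ab_exps = a_exps + b_exps
print(dict(sorted(ab_exps.items())))           # {2: 7, 3: 3, 5: 2, 7: 2, 17: 1}
print(value(ab_exps) == 7200 * 9996)           # True
```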

Let us return to the general situation. Any a > 1 can be expressed uniquely 
(up to order, of course) as a product of prime powers—that is, 

$$a = p_1^{r_1} p_2^{r_2} \cdots p_m^{r_m},$$

where $p_1, p_2, \ldots, p_m$ are distinct primes and all the exponents $r_1, r_2, \ldots, r_m$
are greater than 0. Naturally, another integer $b > 1$ has an expression of form

$$b = q_1^{s_1} q_2^{s_2} \cdots q_n^{s_n},$$

where $q_1, q_2, \ldots, q_n$ are distinct primes and the exponents $s_1, s_2, \ldots, s_n$ are
greater than 0. If we try to write out the prime power factorization of $ab$,
things turn out awkwardly because the notation gets in the way. In general,
some primes appear among both the $p$'s and the $q$'s, other primes occur only
among the $p$'s, and still others only among the $q$'s. It is obviously a mess to
carry out, in a theoretical context, the procedure employed above in the
numerical example $a = 7200$, $b = 9996$: namely, when a prime occurs in just
one of the factorizations, insert it into the other factorization with exponent 0.
After juggling and renaming the $p$'s and $q$'s the result is, in the general case,
expressions for $a$ and $b$ such that exactly the same primes appear in the two
expressions, except that 0 exponents can occur. Then to multiply $a$ and $b$ we
simply add exponents for corresponding primes. Note that we do not even
try to set up the required notation.
Suppose further that we consider the same element $a$ and replace $b$ by an
arbitrary integer $c > 1$. As above, by inserting appropriate primes with exponent
0, we arrive at expressions for $a$ and $c$ in which exactly the same primes
occur. However, since $c$ and $b$ need not have the same primes appearing in
their original factorizations (that is, when all exponents are greater than 0) the
expression we obtain here for $a$ is not, under normal circumstances, identical
with our previous expression for $a$. For example, we have seen that for $a =
7200$, $b = 9996$ we obtain expressions

$$a = 2^5 \cdot 3^2 \cdot 5^2 \cdot 7^0 \cdot 17^0, \qquad b = 2^2 \cdot 3^1 \cdot 5^0 \cdot 7^2 \cdot 17^1.$$

If we then take $c = 80,465 = 5 \cdot 7 \cdot 11^2 \cdot 19$, we obtain for $a$ and $c$ the
expressions

$$a = 2^5 \cdot 3^2 \cdot 5^2 \cdot 7^0 \cdot 11^0 \cdot 19^0, \qquad c = 2^0 \cdot 3^0 \cdot 5^1 \cdot 7^1 \cdot 11^2 \cdot 19^1.$$

Clearly, the notation (especially because it is not unique) leaves something to 
be desired, and it would be nice to have a uniform notation which is a help 
rather than a hindrance. Our approach will be to look at all primes and 
simultaneously to keep track of the exponents. 

1-5-1. Definition. For any prime $p$ and any integer $a > 1$, we let $v_p(a)$ (read
this as “nu sub $p$ of $a$”) denote the exponent to which $p$ appears in the
factorization of $a$.

Of course, it is understood that when $p$ is not a factor of $a$, so that it does
not appear in the factorization of $a$, we write $v_p(a) = 0$. This amounts to
taking the formal view that $p^0$ is the power of $p$ which appears in the
factorization of $a$.

Some concrete examples should help to clarify the definition. For $a =
7200 = 2^5 \cdot 3^2 \cdot 5^2$, we have

$$v_2(7200) = 5, \quad v_3(7200) = 2, \quad v_5(7200) = 2.$$

Furthermore, if $p$ is any prime other than 2, 3, or 5, then it does not divide
7200 and does not appear in the factorization; so

$$v_p(7200) = 0 \text{ for all } p \ne 2, 3, 5.$$

In similar fashion, if $b = 9996 = 2^2 \cdot 3^1 \cdot 7^2 \cdot 17^1$, then

$$v_2(9996) = 2, \quad v_3(9996) = 1, \quad v_7(9996) = 2, \quad v_{17}(9996) = 1$$

and, as for the remaining primes,

$$v_p(9996) = 0 \text{ for all } p \ne 2, 3, 7, 17.$$

Finally for $c = 80,465 = 5 \cdot 7 \cdot 11^2 \cdot 19$ we have

$$v_5(80,465) = 1, \quad v_7(80,465) = 1, \quad v_{11}(80,465) = 2, \quad v_{19}(80,465) = 1,$$

$$v_p(80,465) = 0 \text{ for all } p \ne 5, 7, 11, 19.$$

It should be noted that the definition of $v_p(a)$ depends on two “variables”:
the integer $a > 1$ and the prime $p$. Our discussion will emphasize fixing $a$, and
studying the collection of all $v_p(a)$ as $p$ runs over the set of all primes.
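For readers who like to experiment, the definition of $v_p(a)$ translates directly into repeated division. The following Python sketch is ours, not part of the text; the function name `nu` is an arbitrary choice, and the code assumes that `p` really is prime and that `a` is a positive integer.

```python
def nu(p: int, a: int) -> int:
    """Exponent to which the prime p appears in the factorization of a >= 1."""
    if a < 1:
        raise ValueError("this sketch only handles positive integers a")
    exponent = 0
    while a % p == 0:      # strip off one factor p at a time
        a //= p
        exponent += 1
    return exponent

print([nu(p, 7200) for p in (2, 3, 5, 7)])       # [5, 2, 2, 0]
print([nu(p, 9996) for p in (2, 3, 7, 17)])      # [2, 1, 2, 1]
print([nu(p, 80465) for p in (5, 7, 11, 19)])    # [1, 1, 2, 1]
```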

Suppose we fix $a > 1$ and look at the prime factorization of $a$. For the
primes $p$ that appear in this factorization we have clearly $v_p(a) > 0$; these are,
of course, the primes which divide $a$, and there are only a finite number of
such primes. On the other hand, for the primes $p$ that do not appear in the
factorization of $a$ we have $v_p(a) = 0$; these are the primes that do not divide $a$,
and there are an infinite number of such primes. This information may be
summarized and restated as follows:


1-5-2. Proposition. Given $a > 1$ we have

(i) $v_p(a) \ge 0$ for all $p$,

(ii) $v_p(a) = 0$ for almost all $p$,

(iii) $v_p(a) > 0 \Leftrightarrow p$ divides $a$,

(iv) $v_p(a) = 0 \Leftrightarrow p$ does not divide $a$.

Proof: The proof has already been given. It is only necessary to note that
the phrase “for almost all $p$” is common mathematical parlance; its meaning
is “for all but a finite number of $p$.” ∎

Suppose $a > 1$ is fixed; what profit is there in knowing $v_p(a)$ for every
prime $p$? A concrete example will help us understand the answer. Consider
$a = 142,025 = 5^2 \cdot 13 \cdot 19 \cdot 23$, so that

$$v_5(142,025) = 2, \quad v_{13}(142,025) = 1, \quad v_{19}(142,025) = 1, \quad v_{23}(142,025) = 1,$$

$$v_p(142,025) = 0 \text{ for all } p \ne 5, 13, 19, 23.$$

Now, given all the exponents $v_p(142,025)$, let us consider the expression

$$2^0 \cdot 3^0 \cdot 5^2 \cdot 7^0 \cdot 11^0 \cdot 13^1 \cdot 17^0 \cdot 19^1 \cdot 23^1 \cdot 29^0 \cdot 31^0 \cdots$$

It is formed by writing each prime $p$ with the exponent $v_p(a)$; of course, the
three dots at the end indicate that we run through the infinite collection of all
primes and that the missing primes occur with exponent 0.

This expression may be viewed as a formal expression in which we have a
term $p^{v_p(a)}$ for each prime $p$; however, we prefer to view it as a product. The
fact that it seems to be an infinite product is no problem: in fact, because
$v_p(a) = 0$ for almost all $p$, we see that almost all terms are $p^{v_p(a)} = p^0 = 1$, so
only a finite number of terms in the product are not equal to 1, and we are
really dealing with a product of a finite number of terms. As the reader has
probably observed already, this product is precisely $a = 142,025$ itself; after
all, what we have effectively done is take the factorization $5^2 \cdot 13 \cdot 19 \cdot 23$ of
$a$ and insert every other prime with exponent 0. Thus, knowing the set of
values $\{v_p(142,025) \mid p \text{ prime}\}$ enables us to recapture $a = 142,025$; we simply
take the product of all terms of form $p^{v_p(142,025)}$.

Next, let us turn to the case of an arbitrary integer $a > 1$. Consider the
infinite product

$$\prod_p p^{v_p(a)}$$

where $p$ runs over all primes. This is a compact notation for the infinite product,
and is obviously more economical than writing it out as was done above.
Of course, the product is really finite; to see this one notes that because
$v_p(a) = 0$ for almost all $p$, there are only a finite number of factors $p^{v_p(a)}$ which
are not equal to 1. Moreover, it is clear that these factors $p^{v_p(a)} \ne 1$ are precisely
the terms of the prime power factorization of $a$, so

$$a = \prod_p p^{v_p(a)}. \qquad (*)$$

Furthermore, we note that the set of all $v_p(a)$, $\{v_p(a) \mid p \text{ prime}\}$, determines $a$,
in the sense that if $a$ is lost, then it can be recaptured from the $v_p(a)$'s by
writing $\prod_p p^{v_p(a)}$. In particular, $p^{v_p(a)}$ is the highest power of $p$ which divides $a$.
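The passage from $a$ to its exponents and back again can be tried out numerically. The sketch below is an illustration under our own naming (`exponents`, `recapture`); it finds the nonzero $v_p(a)$ by simple trial division, which is adequate only for small numbers, and stores just the finitely many primes with positive exponent.

```python
from math import prod

def exponents(a: int) -> dict:
    """Return {p: v_p(a)} for the primes p dividing a >= 1 (trial division)."""
    result = {}
    p = 2
    while p * p <= a:
        while a % p == 0:
            result[p] = result.get(p, 0) + 1
            a //= p
        p += 1
    if a > 1:                      # what remains is a single prime factor
        result[a] = result.get(a, 0) + 1
    return result

def recapture(exps: dict) -> int:
    """Form the (finite) product of p ** v_p over the stored primes."""
    return prod(p ** e for p, e in exps.items())

table = exponents(142025)
print(table)                # {5: 2, 13: 1, 19: 1, 23: 1}
print(recapture(table))     # 142025
```

Primes missing from the table play the role of the factors $p^0 = 1$, so leaving them out does not change the product.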

1-5-3. Proposition. Any $a > 1$ has an expression of form

$$a = \prod_p p^{v_p(a)}.$$

Moreover if, in addition, $b > 1$, then the following are equivalent:

(i) $a = b$,

(ii) $v_p(a) = v_p(b)$ for all $p$,

(iii) $\prod_p p^{v_p(a)} = \prod_p p^{v_p(b)}$.

Proof: The validity of the expression for $a$ was proved above. Of course,
$b$ has an expression of form

$$b = \prod_p p^{v_p(b)}$$

and we must show that

$$(i) \Leftrightarrow (ii) \Leftrightarrow (iii).$$

From the definition of $v_p$ and the foregoing discussion we have

$a = b \Rightarrow v_p(a) = v_p(b)$ for all $p$

$\Rightarrow \prod_p p^{v_p(a)} = \prod_p p^{v_p(b)}$

$\Rightarrow a = b$.

This shows that

$$(i) \Rightarrow (ii) \Rightarrow (iii) \Rightarrow (i)$$

and completes the proof. ∎

The importance of this result for us is that it provides a criterion for
deciding when two integers, both greater than 1, are equal: namely, when the
corresponding $v_p$'s are equal for every $p$.


1-5-4. Remark. We may extend the definition of $v_p$ to other integers.
Since 1 may be written in the form

$$1 = 2^0 \cdot 3^0 \cdot 5^0 \cdot 7^0 \cdot 11^0 \cdots,$$

it is clear that for any prime $p$ we should define

$$v_p(1) = 0$$

and hence,

$$1 = \prod_p p^{v_p(1)},$$

which tells us that the expression $(*)$ is valid also for $a = 1$.

What about $v_p$ of negative numbers? Any negative number is of form $-a$
where $a \ge 1$. But then

$$-a = (-1)a = (-1) \prod_p p^{v_p(a)}$$

is essentially a factorization of $-a$. Consequently, we define $v_p(-a)$ as the
exponent to which $p$ appears in the factorization of $-a$ (which is identical
with the way in which $v_p(a)$ is defined for positive $a$), and it follows immediately
that

$$v_p(-a) = v_p(a) \text{ for all } p. \qquad (**)$$

It is now clear that the assertions of 1-5-2 hold for all $a \ne 0$, and 1-5-3 holds
for $a > 0$, $b > 0$.

As for $v_p(0)$, at this stage, it is best to brush it aside saying that it is
undefined.


1-5-5. Proposition. If $a \ne 0$, $b \ne 0$, then

$$v_p(ab) = v_p(a) + v_p(b) \text{ for all } p.$$

Proof: In virtue of $(**)$ there is no loss of generality in taking $a > 0$, $b > 0$.
We have the expressions

$$a = \prod_p p^{v_p(a)}, \qquad b = \prod_p p^{v_p(b)}, \qquad ab = \prod_p p^{v_p(ab)}.$$

On the other hand,

$$ab = \Big(\prod_p p^{v_p(a)}\Big)\Big(\prod_p p^{v_p(b)}\Big) = \prod_p p^{v_p(a) + v_p(b)}$$

because multiplication involves simply adding exponents for each $p$. (Here is
one place where we benefit from our notation which treats all primes equally.)
We have, therefore,

$$\prod_p p^{v_p(ab)} = \prod_p p^{v_p(a) + v_p(b)}.$$

Now, both $v_p(a)$ and $v_p(b)$ are equal to 0 for almost all $p$, so $v_p(a) + v_p(b)$
equals 0 for almost all $p$. Thus, the right side is indeed a finite product, as is
the left side. We have, therefore, two prime power factorizations of the same
integer $ab$. By uniqueness, the factorizations are identical. Putting everything
together, we conclude that $v_p(ab) = v_p(a) + v_p(b)$ for all $p$. ∎

The above proof involves a good deal of fussing, and there remain a number
of details that were glossed over; however, the statement could easily be
considered as obvious from the start. After all, it simply says that for each
prime $p$, its exponent in the product equals the sum of its exponents in the two
terms.

Let us turn to questions of divisibility. Among the divisors of $7200 =
2^5 \cdot 3^2 \cdot 5^2$ we certainly have $2$, $2 \cdot 3$, $3 \cdot 5^2$, $2^5 \cdot 3$, $2^4 \cdot 3^2 \cdot 5$, and so on. Even
more, in virtue of unique factorization it is clear that an integer of form

$$2^{r_1} 3^{r_2} 5^{r_3}, \qquad r_1 \ge 0, \; r_2 \ge 0, \; r_3 \ge 0$$

is a divisor of 7200 if and only if $r_1 \le 5$, $r_2 \le 2$, $r_3 \le 2$, and also that every
divisor of 7200 is of this form. Although we have not mentioned the primes
$p \ne 2, 3, 5$ explicitly, there is no doubt that the set of all exponents tells the
story. The general result is as follows:

1-5-6. Proposition. If $a \ne 0$, $b \ne 0$, then

$$a \mid b \Leftrightarrow v_p(a) \le v_p(b) \text{ for all } p.$$

Proof: This is obvious from the way multiplication works, but we choose
to give a formal proof. As before, there is no loss of generality in assuming
$a > 0$, $b > 0$. Now

$a \mid b \Rightarrow$ there exists $c > 0$ such that $ac = b$
$\Rightarrow v_p(a) + v_p(c) = v_p(b)$ for all $p$
$\Rightarrow v_p(a) \le v_p(b)$ for all $p$.

As for the converse, if $v_p(a) \le v_p(b)$ for all $p$, then, first of all, $v_p(a) =
v_p(b) = 0$ for almost all $p$. Therefore, $v_p(b) - v_p(a) \ge 0$ for all $p$, and
$v_p(b) - v_p(a) = 0$ for almost all $p$. This permits us to define

$$\prod_p p^{v_p(b) - v_p(a)}$$

(which is really a finite product) and to call this positive integer $c$.
Consequently,

$$ac = \Big(\prod_p p^{v_p(a)}\Big)\Big(\prod_p p^{v_p(b) - v_p(a)}\Big) = \prod_p p^{v_p(b)} = b$$

and the proof is complete. ∎
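The description of the divisors of 7200 by exponent bounds is easy to confirm by machine. The Python sketch below is our own illustration (the names `divisors` and `brute` are arbitrary): it builds every integer $2^{r_1} 3^{r_2} 5^{r_3}$ with $r_1 \le 5$, $r_2 \le 2$, $r_3 \le 2$ and compares the list with the divisors found by direct trial.

```python
from itertools import product

# All integers 2^r1 * 3^r2 * 5^r3 with 0 <= r1 <= 5, 0 <= r2 <= 2, 0 <= r3 <= 2.
divisors = sorted(2 ** r1 * 3 ** r2 * 5 ** r3
                  for r1, r2, r3 in product(range(6), range(3), range(3)))

# Divisors of 7200 found the slow way, by testing every candidate.
brute = [d for d in range(1, 7201) if 7200 % d == 0]

print(len(divisors))        # 54
print(divisors == brute)    # True
```

The count $54 = 6 \cdot 3 \cdot 3$ reflects the independent choices available for the three exponents.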

We have just seen that the set of all exponents provides a tool for dealing
with questions of divisibility. It is not surprising, therefore, that the set
of all exponents enables us to describe and analyze the gcd. For example,
suppose $a = 7200 = 2^5 \cdot 3^2 \cdot 5^2$, $b = 9996 = 2^2 \cdot 3^1 \cdot 7^2 \cdot 17^1$, $c = 80,465 =
5^1 \cdot 7^1 \cdot 11^2 \cdot 19^1$. (To keep the notation simple we leave out the terms with 0
exponent.) Among the common divisors of $a$ and $b$ we clearly find $2$, $2^2$, $3$; it
is also obvious that

$$(a, b) = 2^2 \cdot 3^1.$$

In similar fashion, because the story is in the exponents, we see that

$$(a, c) = 5^1, \qquad (b, c) = 7^1.$$

The general result is as follows:

1-5-7. Proposition. If $a \ne 0$, $b \ne 0$, then

$$(a, b) = \prod_p p^{\min\{v_p(a), v_p(b)\}}.$$

Proof: After having done the concrete examples, the proof is obvious, but
we give the details.

By $\min\{v_p(a), v_p(b)\}$ we mean, as is customary, the minimum of the two
numbers $v_p(a)$, $v_p(b)$. First of all, we need to know that

$$\prod_p p^{\min\{v_p(a), v_p(b)\}}$$

has meaning; in other words, that it represents an integer. For this, we must
verify the two conditions

(i) $\min\{v_p(a), v_p(b)\} \ge 0$ for all $p$,

(ii) $\min\{v_p(a), v_p(b)\} = 0$ for almost all $p$.

Now, (i) is immediate because both $v_p(a) \ge 0$ and $v_p(b) \ge 0$ for all $p$, and (ii)
follows from the fact that both $v_p(a) = 0$ and $v_p(b) = 0$ for almost all $p$.
Consequently, we may consider the integer

$$d = \prod_p p^{\min\{v_p(a), v_p(b)\}}$$

and check that it satisfies the requirements for the gcd (as given in the
definition 1-3-1).

Since $v_p(d) = \min\{v_p(a), v_p(b)\} \le v_p(a)$ and $v_p(d) = \min\{v_p(a), v_p(b)\} \le
v_p(b)$ for all $p$, it follows from 1-5-6 that $d \mid a$ and $d \mid b$. Furthermore, if $c \mid a$ and
$c \mid b$, then, according to 1-5-6, $v_p(c) \le v_p(a)$ and $v_p(c) \le v_p(b)$ for all $p$.
Therefore,

$$v_p(c) \le \min\{v_p(a), v_p(b)\} = v_p(d) \text{ for all } p,$$

and, by another application of 1-5-6, $c \mid d$. Of course, $d > 0$, which completes
the proof that $d$ is the gcd of $a$ and $b$. ∎
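As a check on 1-5-7, one can compute the gcd from the exponents and compare it with the answer produced by the Euclidean algorithm. The sketch below is ours: `exponents` and `gcd_from_exponents` are hypothetical helper names, trial division is used only because the numbers are small, and Python's `Counter` is convenient because its `&` operation keeps the minimum count for each key.

```python
from collections import Counter
from math import gcd, prod

def exponents(a: int) -> Counter:
    """Counter mapping each prime p dividing a >= 1 to v_p(a) (trial division)."""
    counts = Counter()
    p = 2
    while p * p <= a:
        while a % p == 0:
            counts[p] += 1
            a //= p
        p += 1
    if a > 1:
        counts[a] += 1
    return counts

def gcd_from_exponents(a: int, b: int) -> int:
    """Product of p ** min(v_p(a), v_p(b)) over the primes common to a and b."""
    common = exponents(a) & exponents(b)   # '&' takes the minimum exponent
    return prod(p ** e for p, e in common.items())

a, b, c = 7200, 9996, 80465
print(gcd_from_exponents(a, b), gcd_from_exponents(a, c), gcd_from_exponents(b, c))  # 12 5 7
print(gcd(a, b), gcd(a, c), gcd(b, c))                                                # 12 5 7
```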

We turn next to a notion that is analogous (“dual” is a better word) to
the greatest common divisor.

1-5-8. Definition. Suppose $a \ne 0$ and $b \ne 0$ are given. Any integer $m$
which satisfies the following condition

(i) $a \mid m$ and $b \mid m$

is said to be a common multiple of $a$ and $b$. If, in addition, $m$ satisfies the
following two conditions

(ii) if $a \mid c$ and $b \mid c$, then $m \mid c$,

(iii) $m > 0$,

then $m$ is said to be a least common multiple (we abbreviate this to lcm) of
$a$ and $b$.

Expressed in words: A least common multiple of $a$ and $b$ is a common
multiple which is positive and is also a divisor of every common multiple.

For any $a \ne 0$, $b \ne 0$, the element $ab$ is a common multiple, and in fact, so
is 0. It is easy to see that 2 and 3 have least common multiple 6, and that 4 and
6 have least common multiple 12. The full story about lcm is given by our
next result.

1-5-9. Theorem. Given $a \ne 0$, $b \ne 0$, then their least common multiple
exists and is unique; we denote it by $[a, b]$. Furthermore, $[a, b]$ is the
smallest positive common multiple of $a$ and $b$, and it can be expressed in
the form

$$[a, b] = \prod_p p^{\max\{v_p(a), v_p(b)\}}.$$

Proof: Uniqueness is easy, as in the case of gcd. In fact, if both $m$ and $m'$
are least common multiples of $a$ and $b$, then $m \mid m'$ and $m' \mid m$. Hence $m = m'$,
because both are positive.

Existence is also easy. Because the proof uses the same techniques as in
1-5-7, we merely sketch it. It is clear that $\max\{v_p(a), v_p(b)\}$ (by which we mean
the maximum of the two numbers $v_p(a)$, $v_p(b)$) is greater than or equal to 0
for all $p$, and equal to 0 for almost all $p$. Therefore,

$$\prod_p p^{\max\{v_p(a), v_p(b)\}}$$

represents an integer; call it $m$. Of course, $m > 0$. Since

$$v_p(a) \le \max\{v_p(a), v_p(b)\} = v_p(m) \text{ for all } p,$$

it follows that $a \mid m$; similarly $b \mid m$; so $m$ is a common multiple of $a$ and $b$.
Finally,

$a \mid c$, $b \mid c \Rightarrow v_p(a) \le v_p(c)$ and $v_p(b) \le v_p(c)$ for all $p$

$\Rightarrow v_p(m) = \max\{v_p(a), v_p(b)\} \le v_p(c)$ for all $p$

$\Rightarrow m \mid c$

which shows that $m$ is a least common multiple, and hence $m$ is the lcm.

Obviously, $m$ is the smallest positive common multiple of $a$ and $b$. Note
also that $[a, b] = [b, a]$ because everything about lcm is symmetric in $a$
and $b$. ∎

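1-5-10. Example. To illustrate 1-5-7, 1-5-9, and how one operates with
exponents, suppose

$$a = 2 \cdot 3^4 \cdot 19^3 \cdot 37^2 \cdot 97^5,$$

$$b = 3^3 \cdot 5 \cdot 11^5 \cdot 19^2 \cdot 79^6 \cdot 97,$$

$$c = 2 \cdot 3^2 \cdot 5 \cdot 11 \cdot 13^3 \cdot 37^2 \cdot 97 \cdot 113.$$

By working accurately, we see that

$$ab = 2 \cdot 3^7 \cdot 5 \cdot 11^5 \cdot 19^5 \cdot 37^2 \cdot 79^6 \cdot 97^6,$$

$$ac = 2^2 \cdot 3^6 \cdot 5 \cdot 11 \cdot 13^3 \cdot 19^3 \cdot 37^4 \cdot 97^6 \cdot 113,$$

$$bc = 2 \cdot 3^5 \cdot 5^2 \cdot 11^6 \cdot 13^3 \cdot 19^2 \cdot 37^2 \cdot 79^6 \cdot 97^2 \cdot 113,$$

$$(a, b) = 3^3 \cdot 19^2 \cdot 97,$$

$$(a, c) = 2 \cdot 3^2 \cdot 37^2 \cdot 97,$$

$$(b, c) = 3^2 \cdot 5 \cdot 11 \cdot 97,$$

$$[a, b] = 2 \cdot 3^4 \cdot 5 \cdot 11^5 \cdot 19^3 \cdot 37^2 \cdot 79^6 \cdot 97^5,$$

$$[a, c] = 2 \cdot 3^4 \cdot 5 \cdot 11 \cdot 13^3 \cdot 19^3 \cdot 37^2 \cdot 97^5 \cdot 113,$$

$$[b, c] = 2 \cdot 3^3 \cdot 5 \cdot 11^5 \cdot 13^3 \cdot 19^2 \cdot 37^2 \cdot 79^6 \cdot 97 \cdot 113.$$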
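Working “accurately” with this many exponents is error-prone by hand, so it may be reassuring to let a machine repeat the computation. In the following Python sketch (our own illustration, not part of the text) each number is stored as a table of exponents; the three `Counter` operations `+`, `&`, and `|` add exponents, take their minimum, and take their maximum, which is exactly what 1-5-5, 1-5-7, and 1-5-9 prescribe.

```python
from collections import Counter

# Exponent tables for the three numbers of the example.
a = Counter({2: 1, 3: 4, 19: 3, 37: 2, 97: 5})
b = Counter({3: 3, 5: 1, 11: 5, 19: 2, 79: 6, 97: 1})
c = Counter({2: 1, 3: 2, 5: 1, 11: 1, 13: 3, 37: 2, 97: 1, 113: 1})

def show(exps):
    """Return the table as a dict with the primes in increasing order."""
    return dict(sorted(exps.items()))

print(show(a + b))   # exponents of ab:     {2: 1, 3: 7, 5: 1, 11: 5, 19: 5, 37: 2, 79: 6, 97: 6}
print(show(a & b))   # exponents of (a, b): {3: 3, 19: 2, 97: 1}
print(show(a | b))   # exponents of [a, b]: {2: 1, 3: 4, 5: 1, 11: 5, 19: 3, 37: 2, 79: 6, 97: 5}
```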





1-5-11. Proposition. If $a > 0$, $b > 0$, then

$$ab = (a, b)[a, b].$$

Proof: By virtue of 1-5-3, it suffices to show that for every prime $p$,

$$v_p(ab) = v_p\{(a, b)[a, b]\}.$$

Since $v_p\{(a, b)\} = \min\{v_p(a), v_p(b)\}$ and $v_p\{[a, b]\} = \max\{v_p(a), v_p(b)\}$, all that
has to be shown is

$$v_p(a) + v_p(b) = \min\{v_p(a), v_p(b)\} + \max\{v_p(a), v_p(b)\}.$$

But this is trivial; because the roles of $a$ and $b$ are interchangeable, we may assume
that $v_p(a) \le v_p(b)$, and then $\min\{v_p(a), v_p(b)\} = v_p(a)$, $\max\{v_p(a), v_p(b)\} =
v_p(b)$. This completes the proof. ∎

Essentially the same result holds under the more general hypotheses
$a \ne 0$, $b \ne 0$; the left side of the equation must then be replaced by $|ab|$.

1-5-12. Exercise. Consider $a_1 \ne 0$, $a_2 \ne 0$, ..., $a_n \ne 0$, where $n \ge 2$. In
keeping with what has gone before, define a gcd of $a_1, a_2, \ldots, a_n$ to be a
number $d$ satisfying the conditions: (i) $d \mid a_i$ for $i = 1, \ldots, n$ (that is, $d$ is a
common divisor of the $a_i$'s); (ii) if $c \mid a_i$ for $i = 1, \ldots, n$, then $c \mid d$ (that is,
if $c$ is a common divisor of the $a_i$'s, then $c \mid d$); (iii) $d > 0$. In similar fashion,
define a lcm of $a_1, \ldots, a_n$ to be a number $m$ satisfying the conditions:
(i) $a_i \mid m$ for $i = 1, \ldots, n$; (ii) if $a_i \mid c$ for $i = 1, \ldots, n$, then $m \mid c$; (iii) $m > 0$.

(1) The gcd of $a_1, a_2, \ldots, a_n$ exists and is unique; in fact, if we denote it
by $(a_1, a_2, \ldots, a_n)$, then

$$(a_1, a_2, \ldots, a_n) = \prod_p p^{\min_{i = 1, \ldots, n} \{v_p(a_i)\}}.$$

[If $(a_1, a_2, \ldots, a_n) = 1$, we say that $a_1, a_2, \ldots, a_n$ are relatively prime.]

(2) The lcm of $a_1, a_2, \ldots, a_n$ exists and is unique; in fact, if we denote it
by $[a_1, a_2, \ldots, a_n]$, then

$$[a_1, a_2, \ldots, a_n] = \prod_p p^{\max_{i = 1, \ldots, n} \{v_p(a_i)\}}.$$

(3) The greatest common divisor $(a_1, a_2, \ldots, a_n)$ is the biggest positive
common divisor of $a_1, a_2, \ldots, a_n$. The least common multiple $[a_1, a_2, \ldots, a_n]$ is the
smallest positive common multiple of $a_1, a_2, \ldots, a_n$.

(4) We have

$$(a_1, a_2, a_3) = ((a_1, a_2), a_3),$$

$$(a_1, a_2, a_3, a_4) = ((a_1, a_2, a_3), a_4),$$

$$(a_1, a_2, \ldots, a_{n-1}, a_n) = ((a_1, a_2, \ldots, a_{n-1}), a_n),$$

and

$$[a_1, a_2, a_3] = [[a_1, a_2], a_3],$$

$$[a_1, a_2, a_3, a_4] = [[a_1, a_2, a_3], a_4],$$

$$[a_1, a_2, \ldots, a_{n-1}, a_n] = [[a_1, a_2, \ldots, a_{n-1}], a_n].$$

It is worth elaborating on the significance of these relations. If the factorizations
of $a_1, \ldots, a_n$ are all known, then, by making use of parts (1) and (2), we
can easily find $(a_1, \ldots, a_n)$ and $[a_1, \ldots, a_n]$. However, if the factorizations of
$a_1, \ldots, a_n$ are not known, these relations may be used to determine $(a_1, \ldots, a_n)$
and $[a_1, \ldots, a_n]$. In more detail: First, we compute $(a_1, a_2) = d$ by the methods
of Section 1-3, and then use the relation $|a_1 a_2| = (a_1, a_2)[a_1, a_2]$ to find
$[a_1, a_2] = m$. Once the procedure for computing the gcd and lcm of two integers
is known, our relations call for repeating the process as many times as necessary.
Thus, $(a_1, a_2, a_3)$ is found by computing $((a_1, a_2), a_3) = (d, a_3)$, and
$[a_1, a_2, a_3]$ is found by computing $[[a_1, a_2], a_3] = [m, a_3]$, and so on. Of
course, it is a tedious process, but it works.

Note that in either case, it does not matter in what order the numbers
$a_1, a_2, \ldots, a_n$ are listed.

(5) The gcd $(a_1, \ldots, a_n)$ is the smallest positive integer which can be
expressed as a linear combination of $a_1, \ldots, a_n$; notationally,

$$(a_1, \ldots, a_n) = \min\{x_1 a_1 + x_2 a_2 + \cdots + x_n a_n > 0 \mid x_1, x_2, \ldots, x_n \in \mathbf{Z}\}.$$
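The “tedious process” of part (4), which needs no factorizations, is a natural candidate for mechanization. The sketch below is one possible rendering in Python under our own naming (`lcm2`, `gcd_many`, `lcm_many`); it relies on the standard library's `math.gcd` for the two-number gcd, uses the relation $|a_1 a_2| = (a_1, a_2)[a_1, a_2]$ for the two-number lcm, and folds both over the list with `functools.reduce`. All inputs are assumed nonzero.

```python
from functools import reduce
from math import gcd

def lcm2(x: int, y: int) -> int:
    """lcm of two nonzero integers via |xy| = (x, y)[x, y]."""
    return abs(x * y) // gcd(x, y)

def gcd_many(*nums: int) -> int:
    """(a1, ..., an) computed as ((...((a1, a2), a3)...), an)."""
    return reduce(gcd, nums)

def lcm_many(*nums: int) -> int:
    """[a1, ..., an] computed as [[...[[a1, a2], a3]...], an]."""
    return reduce(lcm2, nums)

print(gcd_many(7200, 9996, 80465))   # 1
print(lcm_many(4, 6, 10))            # 60
```

Reordering the arguments does not change either result, in line with the remark about the order in which the numbers are listed.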

1-5-13 / PROBLEMS 

1. Factor the numbers $a = 28,050$ and $b = 9555$, then use the factorizations
to find $(a, b)$ and $[a, b]$.

2. Find the least common multiple of all integers from 1 through 15. 

3. Suppose $a = 7200$, $b = 9996$, $c = 80,465$. Find

(i) $(a, b)$, $(a, c)$, $(b, c)$,

(ii) $[a, b]$, $[a, c]$, $[b, c]$,

(iii) $(a, b, c)$, $[a, b, c]$,

(iv) $(a, [b, c])$, $[a, (b, c)]$, $[(a, b), (a, c)]$, $([a, b], [a, c])$.

4. Find the least common multiple of $a = 5311$, $b = 7571$. Can you express
it as a linear combination of $a$ and $b$?

5. Suppose $a = 24,523$, $b = 31,373$, $c = 44,929$. Without factoring (which
would take too long), find

(i) $(a, b)$ and $(a, b, c)$,

(ii) $[a, b]$ and $[a, b, c]$,

(iii) $[(a, b), c]$ and $([a, b], c)$.

6. Discuss the least common multiple of $a$ and $b$ when one of them is 0.

7. Let $x$, $y$, $z$ be any real numbers. Show that:

(i) $\min\{\max\{x, y\}, z\} = \max\{\min\{x, z\}, \min\{y, z\}\}$,

(ii) $\max\{\min\{x, y\}, z\} = \min\{\max\{x, z\}, \max\{y, z\}\}$,

(iii) $\min\{\max\{x, y\}, \max\{x, z\}, \max\{y, z\}\}$

$= \max\{\min\{x, y\}, \min\{x, z\}, \min\{y, z\}\}$,

(iv) $\max\{x, y, z\} + \min\{x + y, x + z, y + z\} = x + y + z$.

8. If $a$ is a positive integer, what are $(a, a + 1)$ and $[a, a + 1]$?

9. If $a$ and $b$ are positive integers with $a \mid b$, what are $(a, b)$ and $[a, b]$?

10. For positive integers $a$ and $b$, show that

(i) $(a, b) = [a, b] \Leftrightarrow a = b$,

(ii) $[a, b] = ab \Leftrightarrow (a, b) = 1$.

11. Show that if $a_1, a_2, \ldots, a_n$ are relatively prime in pairs (meaning that any
two are relatively prime), then they are relatively prime.

12. By making use of the $v_p$'s prove that if $m > 0$ and $a$, $b$ are arbitrary, then

(ma, mb) = m(a, b). 

What if m is an arbitrary integer not equal to 0 ? 

13. Prove in two different ways that if m > 0, then 

[ma, mb] = m[a, b] 

for all a , b. What happens if m is an arbitrary integer not equal to 0? 

14. Suppose $a$ and $b$ are positive with $(a, b) = 1$; then $ab$ is a perfect square $\Leftrightarrow$
both $a$ and $b$ are perfect squares. Even more, for any $n \ge 2$, $ab$ is an $n$th
power $\Leftrightarrow$ both $a$ and $b$ are $n$th powers.

15. Suppose we define $v_p(0) = \infty$ for all primes $p$, and consider $\infty$ as bigger
than any integer. Show that for all $a, b \in \mathbf{Z}$

$$v_p(a + b) \ge \min\{v_p(a), v_p(b)\}.$$

Moreover, if $v_p(a) \ne v_p(b)$, then

$$v_p(a + b) = \min\{v_p(a), v_p(b)\}.$$

16. This problem gives a general formulation for a type of argument that
occurs more than once in the text.

(i) Suppose that for each $p$ we are given an integer $m_p \ge 0$ and that
$m_p = 0$ for almost all $p$; then the infinite product

$$\prod_p p^{m_p}$$

represents a positive integer.

(ii) There exists a positive integer $a$ such that

$$v_p(a) = m_p \text{ for all } p.$$

(iii) Suppose, in addition, that for each $p$ we have an integer $n_p \ge 0$ such
that almost all $n_p = 0$. Show that if

$$\prod_p p^{m_p} = \prod_p p^{n_p},$$

then $m_p = n_p$ for all $p$.

17. Decide if the following assertions are true or false: if true, give a proof; 
if false, give a counterexample. 

(i) $(a, b) = (a, c) \Rightarrow [a, b] = [a, c]$.

(ii) $(a, b) = d \Rightarrow (a^2, b^2) = d^2$.

(iii) $(a, b) = (a, c) \Rightarrow (a^2, b^2) = (a^2, c^2)$.

(iv) $(a, b) = (a, c) \Rightarrow (a, b) = (a, b, c)$.

(v) $(a, b) = 1 \Rightarrow (a^2, ab, b^2) = 1$.

(vi) $[a^2, ab, b^2] = [a^2, b^2]$.

(vii) $b \mid (a^2 - 1) \Rightarrow b \mid (a^4 - 1)$.

(viii) $b \mid (a^2 + 1) \Rightarrow b \mid (a^4 + 1)$.

(ix) $a^3 \mid b^3 \Rightarrow a \mid b$.

(x) $a^3 \mid b^2 \Rightarrow a \mid b$.

(xi) $a^2 \mid b^3 \Rightarrow a \mid b$.

(xii) $(a, b, c) = ((a, b), (a, c))$.

18. (i) Show that

$$([a, b], c) = [(a, c), (b, c)],$$

$$[(a, b), c] = ([a, c], [b, c]).$$

(ii) Even more, show inductively that

$$([a_1, a_2, \ldots, a_{n-1}], a_n) = [(a_1, a_n), \ldots, (a_{n-1}, a_n)],$$

$$[(a_1, a_2, \ldots, a_{n-1}), a_n] = ([a_1, a_n], \ldots, [a_{n-1}, a_n]).$$





19. In terms of the $v_p$'s, how can one decide if $(a, b) = 1$?

20. Show that the positive integers $a_1, a_2, \ldots, a_n$ are relatively prime in
pairs $\Leftrightarrow a_1 a_2 \cdots a_n = [a_1, a_2, \ldots, a_n]$.

21. Use the $v_p$'s to do 1-4-8, Problem 10.


1-6. Linear Diophantine Equations 

Given integers $a \ne 0$, $b \ne 0$, and $c$, we want to solve the equation

$$ax + by = c.$$

We must first settle the question of what is meant by a solution. Graphing this 
equation in the plane gives a straight line, and the pair of coordinates x, y of 
any point on this line satisfy the equation. From this point of view, there are 
an infinite number of solutions; for every choice of x we can solve for the 
corresponding y. However, this is not what we have in mind here. Our 
concern is with solutions where both x and y are integers. Geometrically, this 
means that we want to know if the straight line whose equation is $ax + by = c$
passes through any points both of whose coordinates are integers (such a 
point is often referred to as an integral point). 

Thus, by a solution of $ax + by = c$ we shall mean a pair of integers
$\{x_0, y_0\}$ for which $ax_0 + by_0 = c$. The equation is known as a linear
diophantine equation: “linear” signifies that the unknowns $x$ and $y$ appear only
to the first power, “diophantine” signifies that we are concerned solely with
integral solutions (the name derives from that of Diophantus of Alexandria
who considered the problem of finding integral solutions of equations during
the third or fourth century A.D.).

In connection with the linear diophantine equation ax + by = c there are 
several questions about its solutions that we can ask. Does a solution exist? 
Is there a simple criterion for deciding if a solution exists ? If there is a solution, 
how many solutions are there? Can we find all the solutions explicitly? A 
strictly geometric approach to these questions is not satisfactory. It lacks 
precision and the ingredients necessary for giving proofs. For example, by 
drawing the line ax + by = c we can not always decide if it passes through 
an integral point. However, the algebraic approach (with a geometric 
“picture” kept at the back of one’s mind) provides complete and definitive 
answers to all our questions. 


1-6-1. Theorem. Given integers $a \ne 0$, $b \ne 0$, and $c$, let $d = (a, b)$. Then
the linear diophantine equation $ax + by = c$ has a solution $\Leftrightarrow d \mid c$.
Moreover, in the case where a solution exists,

(i) We can find an explicit solution $\{x_0, y_0\}$ by use of the Euclidean
algorithm.

(ii) There are an infinite number of solutions.

(iii) If $\{x_0, y_0\}$ is any known solution, then all solutions are given by the
pairs $\{x, y\}$ where

$$x = x_0 + t\left(\frac{b}{d}\right), \qquad y = y_0 - t\left(\frac{a}{d}\right), \qquad t = 0, \pm 1, \pm 2, \pm 3, \ldots$$


Proof: According to 1-3-6, we know that the set

$$S = \{ax + by \mid x, y \in \mathbf{Z}\} = \{xa + yb \mid x, y \in \mathbf{Z}\}$$

of all linear combinations of $a$ and $b$ is equal to the set

$$T = \{nd \mid n \in \mathbf{Z}\}$$

of all multiples of $d = (a, b)$. Therefore,

$ax + by = c$ has a solution $\Leftrightarrow$ $c$ is a linear combination of $a$ and $b$

$\Leftrightarrow c \in S$

$\Leftrightarrow c \in T$

$\Leftrightarrow d \mid c$,

which proves the first statement.

In this situation, that is, when $(a, b) = d$ divides $c$, let us write

$$a = da', \qquad b = db', \qquad c = dc'.$$

We can find an explicit solution $\{x_0, y_0\}$ of $ax + by = c$ as follows. Since
$d = (a, b)$, $d$ can be expressed as a linear combination of $a$ and $b$; in fact, as
seen in 1-3-2, the Euclidean algorithm may be used to find such a linear
combination. In this way, we locate integers $x'$, $y'$ such that

$$ax' + by' = d.$$

(Note that in the case $a \mid b$, where, for all practical purposes, the Euclidean
algorithm breaks down, we can still locate $x'$ and $y'$; namely, because here
$d = |a|$, we may take $y' = 0$, $x' = \pm 1$, with the sign of $x'$ depending on whether
$a$ is positive or negative.) Then multiplying through by $c'$ gives

$$ac'x' + bc'y' = dc' = c,$$

which says that for $x_0 = c'x'$, $y_0 = c'y'$ the pair $\{x_0, y_0\}$ is a solution of
$ax + by = c$. This proves (i).
To prove (ii) suppose $\{x_0, y_0\}$ is any solution; it does not have to be the
solution we found above. For any integer $t$, let us put

$$x = x_0 + tb', \qquad y = y_0 - ta'.$$

It is easy to check, by substituting in the diophantine equation, that $\{x, y\}$ is a
solution; in detail,

$$ax + by = a(x_0 + tb') + b(y_0 - ta')$$

$$= ax_0 + by_0 + t(ab' - ba')$$

$$= c + t(da'b' - db'a')$$

$$= c.$$

[Note that the choice of $x$ and $y$ is quite natural; it is based on the observation
that $ab' + b(-a') = 0$.] Consequently, we have an infinite number of solutions,
one for each choice of $t \in \mathbf{Z}$.

To prove (iii), it remains to show that if a solution $\{x_0, y_0\}$ is known and
$\{x_1, y_1\}$ is an arbitrary solution, then there exists an integer $t$ such that

$$x_1 = x_0 + tb', \qquad y_1 = y_0 - ta'.$$

(Note that in the statement of the theorem we found it convenient to write
$a/d$ for $a'$ and $b/d$ for $b'$, rather than introduce the additional elements $a'$ and
$b'$. Of course, the properties of fractions are never used in the proof, nor do
the fractions $a/d$, $b/d$ appear.) Because $\{x_0, y_0\}$ and $\{x_1, y_1\}$ are solutions, we
have $ax_0 + by_0 = c = ax_1 + by_1$ which implies

$$a(x_1 - x_0) = b(y_0 - y_1).$$

Recalling that $a = da'$, $b = db'$, we obtain

$$a'(x_1 - x_0) = b'(y_0 - y_1).$$

Now, according to 1-3-8, $(a', b') = 1$, so it follows from 1-3-10, part (ii) that
$a' \mid (y_0 - y_1)$. Thus, there exists $t \in \mathbf{Z}$ with $ta' = y_0 - y_1$. Substituting this
back gives

$$a'(x_1 - x_0) = b'ta'.$$

We conclude that $x_1 = x_0 + tb'$ and $y_1 = y_0 - ta'$, which completes the
proof. ∎

The restrictions a ≠ 0, b ≠ 0 were included in the statement of the theorem,
because if one of them is 0 our linear diophantine equation becomes ax = c or 
by = c —and we have no interest in these equations because they are old hat, 
involving nothing more than questions of divisibility. The reader will note 
that if one of a or b is 0 the assertions of the theorem still hold (except for the 
reference to the Euclidean algorithm). 
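
The procedure used in the proof of 1-6-1 can be sketched in Python. This is only an illustrative sketch, not part of the text: the function names `extended_gcd` and `solve_linear_diophantine` are our own, and the extended Euclidean algorithm is written in its usual iterative form.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) >= 0 and a*x + b*y == d."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalise so that d is non-negative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def solve_linear_diophantine(a, b, c):
    """Solve a*x + b*y == c over the integers (a, b not both zero).

    Returns None when d = gcd(a, b) does not divide c. Otherwise returns
    (x0, y0, dx, dy), meaning that every solution has the form
    x = x0 + dx*t, y = y0 - dy*t for t in Z, where dx = b/d and dy = a/d.
    """
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    d, x1, y1 = extended_gcd(a, b)
    if c % d != 0:
        return None
    k = c // d                      # this is c' in the notation of the proof
    return x1 * k, y1 * k, b // d, a // d


print(solve_linear_diophantine(258, 354, 18))
```

For a = 258, b = 354, c = 18 the sketch returns x0 = 33, y0 = −24 with step sizes 59 and 43.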
Of course, it is quite possible that c = 0. In this situation, everything goes
quickly. One solution of ax + by = 0 is x_0 = b, y_0 = −a, and all the solutions
are given by

x = b + (b/d)t,
y = −a − (a/d)t,        t = 0, ±1, ±2, ....


1-6-2. Example. Consider the linear diophantine equation 

258x + 354y= 18. (*) 

In order to solve this equation, we pattern our discussion on the proof of 
1-6-1 with a = 258, b = 354, c = 18. 

To decide if a solution exists, we must check if d = (258, 354) divides
c = 18. It is easy to see that (258, 354) = 6 (in fact, this was done in 1-3-3), so 
a solution exists. 

To find an explicit solution {x_0, y_0}, we begin by expressing d = 6 as a
linear combination of a = 258 and b = 354 via the Euclidean algorithm. This
has already been done in 1-3-7, with the result

(11)(258) + (−8)(354) = 6.

(In the notation of the proof of 1-6-1, this means that x' = 11, y' = −8.)
Multiplying this equation by 3 (of course, 3 = c/d = c') we have

(33)(258) + (−24)(354) = 18,

so x_0 = 33, y_0 = −24 is a solution of (*).

It remains to list all the solutions of (*). Using the fact that a' = a/d =
258/6 = 43 and b' = b/d = 354/6 = 59, it follows that all solutions of (*) are
given by

x = 33 + 59t,
y = −24 − 43t,        t = 0, ±1, ±2, ....


Once this example is done, there are a number of related linear diophantine 
equations which can be solved with a minimum of effort. Let us list a few. 

(i) Consider the equation

258x − 354y = 18.

Here, a = 258, b = −354, c = 18. From previous results we know that d = 6,
and

(33)(258) + (24)(−354) = 18.

Therefore, x_0 = 33, y_0 = 24 is a solution, and all solutions are given by

x = 33 − 59t,
y = 24 − 43t,        t = 0, ±1, ±2, ....

(ii) Consider the equation

354x − 258y = 18.


In this case, according to the way our notation is set up, a = 354, b = −258,
c = 18, d = (354, −258) = 6, a/d = 59, b/d = −43. The previous expression
for 18 can be rewritten in the form

(−24)(354) + (−33)(−258) = 18.

Hence, x_0 = −24, y_0 = −33 is a solution, and all solutions are given by

x = −24 − 43t,
y = −33 − 59t,        t = 0, ±1, ±2, ±3, ....

Note that here, as elsewhere, we can change signs, because t runs over Z, and
hence write all solutions in the form

x = −24 + 43t,
y = −33 + 59t,        t = 0, ±1, ±2, ±3, ....


(iii) The linear diophantine equation

354x + 258y = 35

has no solution, because d = (a, b) = (354, 258) = 6 does not divide c = 35.

(iv) Consider the linear diophantine equation

354x + 258y = 36.

Here a = 354, b = 258, c = 36, d = 6. Since 6 | 36, a solution exists. Starting
from the relation

(−8)(354) + (11)(258) = 6,

which was proved earlier, it follows that

(−48)(354) + (66)(258) = 36.

Thus, x_0 = −48, y_0 = 66 is a solution, and all solutions are of form

x = −48 + 43t,
y = 66 − 59t,        t ∈ Z.
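
The families of solutions listed in 1-6-2 can be checked mechanically. The following Python sketch is our own illustration (the table `CLAIMED` and the helper `family_is_valid` are hypothetical names); it substitutes x = x0 + dx·t, y = y0 + dy·t into each equation for a range of t, and confirms that item (iii) fails the divisibility test.

```python
import math

# Each row: (a, b, c, x0, y0, dx, dy), claiming that
# x = x0 + dx*t, y = y0 + dy*t solves a*x + b*y = c for every integer t.
CLAIMED = [
    (258, 354, 18, 33, -24, 59, -43),    # the equation (*)
    (258, -354, 18, 33, 24, -59, -43),   # item (i)
    (354, -258, 18, -24, -33, 43, 59),   # item (ii), second form
    (354, 258, 36, -48, 66, 43, -59),    # item (iv)
]


def family_is_valid(a, b, c, x0, y0, dx, dy, t_values=range(-50, 51)):
    """Check a*(x0 + dx*t) + b*(y0 + dy*t) == c for every t in t_values."""
    return all(a * (x0 + dx * t) + b * (y0 + dy * t) == c for t in t_values)


for row in CLAIMED:
    print(row[:3], family_is_valid(*row))

# Item (iii): 354x + 258y = 35 has no solution because gcd(354, 258) = 6
# does not divide 35.
print(35 % math.gcd(354, 258) != 0)
```

A finite range of t is only a spot check, of course; the identity a·dx + b·dy = 0 is what makes each family valid for all t.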


1-6-3. Euler’s Method. There is another, completely elementary, method
for finding a solution of an explicit linear diophantine equation. We illustrate
the method, which is due to Euler (1707-1783), by finding a solution to the
same equation

258x + 354y = 18 (*)

that was discussed in 1-6-2.

The objective is to find integers x and y for which 258x + 354y = 18.
Rewriting this equation, we have (using our knowledge of fractions)

x = (18 − 354y)/258 = (18 − 96y)/258 − y.

If we put

z = (18 − 96y)/258,

then we are looking for an integer y for which z is an integer; as then x will
be an integer and {x, y} will be a solution of (*). Now consider the expression
for z and rewrite it as 258z = 18 − 96y, which leads to

y = (18 − 258z)/96 = (18 + 30z)/96 − 3z.

(Note that one may also write

y = (18 − 66z)/96 − 2z,

and proceed as we shall do in a moment; we have chosen to use the multiple
of 96 closest to −258.) Thus, we seek an integer z for which

w = (18 + 30z)/96

is an integer. Rewriting this as

z = (96w − 18)/30 = 3w − 1 + (6w + 12)/30,

we seek an integer w for which

t = (6w + 12)/30 = (w + 2)/5

is an integer. But now the numbers are small, and it is trivial to find a w of the
desired type, for example, w = −2, 3, 8, −7, .... Once a w is chosen for
which t is an integer, we proceed through the relations

z = 3w − 1 + t,
y = w − 3z,
x = z − y,

in order to find x and y.

In particular, starting with w = −2, we obtain t = 0, z = −7, y = 19,
x = −26; so {−26, 19} is a solution of (*), and according to 1-6-1 all solutions
are of form

x = −26 + 59t,
y = 19 − 43t,        t ∈ Z.

Of course, this is the same set of solutions that was found in 1-6-2. The
difference in appearance is explained by the fact that we have two distinct
“parametrizations” of the same set.

Similarly, if we start with w = 3, then t = 1, z = 9, y = −24, x = 33; so,
in this case, we obtain the solution {33, −24}, and the set of all solutions
becomes identical in appearance (that is, in notation) with the set of solutions
found in 1-6-2.

In all this, we used Euler’s method to find a particular solution of the 
linear diophantine equation and, in virtue of 1-6-1 and the fact that the gcd 
d = 6 was known, were then able to list all the solutions. However, only slight 
modifications in the organization of Euler’s method are needed to find all 
solutions directly, that is, without finding d or using 1-6-1. In detail, we
proceed as before until we arrive at the relation

t = (w + 2)/5.

Rewriting this in the form w = 5t − 2, we have, therefore, the system of
relations

w = 5t − 2,
z = 3w − 1 + t,
y = w − 3z,
x = z − y,

in which all the unknowns t, w, z, y, and x are to be integers. Now one may
obviously view t ∈ Z as a “parameter” and express the other variables,
especially x and y, in terms of t. In particular, if t = 0, then w = −2, z = −7,
y = 19, x = −26 so {−26, 19} is a solution, and if t = 1, then w = 3, z = 9,
y = −24, x = 33 so {33, −24} is a solution. In general, we have

w = 5t − 2,
z = 3(5t − 2) − 1 + t = 16t − 7,
y = (5t − 2) − 3(16t − 7) = −43t + 19,
x = (16t − 7) − (−43t + 19) = 59t − 26,

so it follows that

x = −26 + 59t,
y = 19 − 43t,        t ∈ Z,

is the set of all solutions!
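
The reduction behind Euler’s method can be sketched in Python as a short recursion. This is a sketch under stated assumptions rather than a transcription of the worked example: the name `euler_solve` is ours, and it uses the ordinary quotient from `divmod` at each step, whereas the text chooses the multiple closest to the coefficient being reduced. Each unknown is carried as a pair (constant, coefficient of t).

```python
def euler_solve(a, b, c):
    """Solve a*x + b*y == c by repeated division, in the spirit of Euler's method.

    Returns None if there is no integer solution. Otherwise returns
    ((x0, x1), (y0, y1)), meaning x = x0 + x1*t and y = y0 + y1*t for t in Z.
    """
    if a == 0 and b == 0:
        raise ValueError("a and b must not both be zero")
    if b == 0:
        # The equation is a*x == c; y is unconstrained and becomes the parameter t.
        if c % a != 0:
            return None
        return (c // a, 0), (0, 1)
    q, r = divmod(a, b)              # a = q*b + r
    # a*x + b*y = b*(q*x + y) + r*x, so with z = q*x + y we need b*z + r*x == c.
    sub = euler_solve(b, r, c)
    if sub is None:
        return None
    z, x = sub
    y = (z[0] - q * x[0], z[1] - q * x[1])   # y = z - q*x
    return x, y


print(euler_solve(258, 354, 18))
```

The parametrization produced this way need not look like the one in the text, but it describes the same set of solutions; no gcd is computed in advance, and an unsolvable equation shows up as a failed divisibility test at the last step.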


1-6-4 / PROBLEMS 


1. Find all solutions of the following linear diophantine equations:

(i) 91x + 33y = 147,        (ii) 24x + 30y = 14,
(iii) 30x − 43y = 97,       (iv) 93x − Sly = IS,
(v) 17x + 646y = 51,        (vi) 91x + 56y = 0,
(vii) 874x − 19y = 1052,    (viii) 84x − 91y = 11,
(ix) 4147x + 10,672y = 58.


2. Solve the equations of Problem (1) by use of Euler’s method.


3. Find all positive solutions (meaning solutions {x, y} with x > 0, y > 0) of
(i) 18x + ly = 302,         (ii) 18x − ly = 302,
(iii) 54x − 38y = 82,       (iv) 11x + 13y = 47,
(v) 10x + 28y = 1240.

A geometric picture is helpful here.


4. Show that there exist no integers a and b such that (a, b) = 7 and a + b =
100. On the other hand, there exist an infinite number of pairs of integers
a and b such that (a, b) = 5 and a + b = 100.


5. Which of the following sets are the same?

(i) {17 + 157t | t ∈ Z},      (ii) {1744 + 157t | t ∈ Z},
(iii) {−768 + 157t | t ∈ Z},  (iv) {100 − 157t | t ∈ Z},
(v) {51 − 157t | t ∈ Z},      (vi) {−57 + 157t | t ∈ Z}.

6. Find two fractions having 5 and 7 for denominators whose sum is equal 
to 26/35. 


7. Find a number that leaves the remainder 16 when divided by 39 and the 
remainder 27 when divided by 56. 





8. Show that a = 14t + 3 and b = 21t + 1 are relatively prime for each
choice of t ∈ Z.

9. If the diophantine equation ax + by = c has no solutions (that is, when
(a, b) does not divide c), then Euler’s method must break down somewhere.
How does this occur?

10. Use Euler’s method to solve

(i) 6x − 10y + 15z = 2, (ii) 28x + 24y − 92z = 202.

11. Solve the simultaneous linear diophantine equations

8x + 5y = −7,

15x − 12y = 393.

12. Four men and a monkey spent the day on a tropical island collecting 
coconuts. At night, while the others slept, one of the men arose and 
divided the nuts into four equal piles. There was one coconut left over, so 
he gave it to the monkey. He then hid his share, put the rest of the nuts 
into a single pile, and went back to sleep. In turn, each of the other men 
went through the same procedure, and each time there was one nut for the 
monkey. In the morning, the four men divided the remaining coconuts 
equally, and again there was one left over for the monkey. What is the 
smallest number of nuts which could have been in the original pile ? 

13. Do the “coconut problem” when there are
(i) 3 men, (ii) 5 men.

14. Suppose in the original coconut problem that the fourth man, after taking 
his share and giving one to the monkey, leaves the island. In the morning, 
the three remaining men give one nut to the monkey (as usual) and take 
equal shares. What happens ? 

15. If a and b are relatively prime positive integers, show that the diophantine 
equation 

ax − by = c

has an infinite number of positive solutions. 

1-7. Congruence 

The notion of congruence was introduced by Gauss (1777-1855; a leading 
candidate for the title of greatest mathematician of all time), and it has 
turned out to be very fruitful. In this section, we merely make the definition, 
and point out a few of its immediate consequences. A detailed study of con¬ 
gruences will be undertaken in Chapter III. 
1-7-1. Definition. Fix an integer m > 0. For any a, b ∈ Z we write

a ≡ b (mod m)

and read this as “a is congruent to b modulo m” when m divides a − b.

For example, 3 ≡ −4 (mod 7), 76 ≡ 21 (mod 11), 2^11 ≡ 1 (mod 23).

1-7-2. Properties. Congruence modulo m behaves like equality in many 
ways. The key properties of congruence which parallel properties of equality 
are the following. 

(i) a ≡ a (mod m) for all a.

This is known as the reflexive property; it is trivial since m divides (a − a) = 0.

(ii) If a ≡ b (mod m), then b ≡ a (mod m).

This is known as the symmetric property of congruence. It requires proof
because the definition distinguishes between the left side and the right side
of the “≡” sign. Of course, the proof is trivial since m | (a − b) implies
m | (b − a).

(iii) If a ≡ b (mod m) and b ≡ c (mod m), then a ≡ c (mod m).

This is known as the transitive property of congruence. It holds because
m | (a − b) and m | (b − c) together imply that m divides (a − b) + (b − c) =
(a − c).


1-7-3. Proposition. For a, b ∈ Z, a ≡ b (mod m) ⇔ both a and b have
the same remainder upon division by m.


Proof: The division algorithm gives unique expressions

a = q_1 m + r_1, 0 ≤ r_1 < m,
b = q_2 m + r_2, 0 ≤ r_2 < m,

and upon subtraction

a − b = (q_1 − q_2)m + (r_1 − r_2). (*)

Since r_1 and r_2 are the remainders upon division by m, we must show that
a ≡ b (mod m) ⇔ r_1 = r_2.

If a ≡ b (mod m), then m | (a − b), and from (*) we see that m | (r_1 − r_2).
Because of the bounds on r_1 and r_2 we have |r_1 − r_2| < m, so it follows that
r_1 = r_2. Conversely, if r_1 = r_2, then (*) implies m | (a − b), which says that
a ≡ b (mod m). ∎
This criterion for when two integers are congruent could have been taken
as the definition for congruence at the start. In other words, the definition
would be that two numbers are congruent (mod m) when their remainders
(upon division by m) are equal. Of course, it does not matter which definition
is given initially; the important thing is that

a ≡ b (mod m) ⇔ m | (a − b) ⇔ a and b have equal remainders.

The remainder criterion for congruence is often convenient for use in a proof; 
for example, it makes it crystal clear that congruence mod m satisfies the 
reflexive, symmetric, and transitive properties. 

Although the definition of congruence was given for any modulus m > 0,
we really have no interest in the case m = 1, for any two integers are
congruent (mod 1), and there is no information about integers that could possibly
be derived from this congruence.

Pursuing the analogy between equality and congruence that was pointed 
up in 1-7-2, we recall that for integers, adding equals to equals gives equals and 
multiplying equals by equals gives equals. It is a central fact about congruences 
that the corresponding statements hold for them also. More precisely, we 
have: 


1-7-4. Proposition. If a ≡ b (mod m) and a' ≡ b' (mod m), then
a + a' ≡ b + b' (mod m) and aa' ≡ bb' (mod m).


Proof: The hypotheses say that m | (a − b) and m | (a' − b'). It follows that
m divides the sum (a − b) + (a' − b') = (a + a') − (b + b'), which means that
a + a' ≡ b + b' (mod m).

As for the second part, we wish to show that

m | (aa' − bb').

The way to accomplish this is to make use of a fairly common “trick”,
which involves adding and subtracting a well-chosen term. Namely,

aa' − bb' = aa' − ba' + ba' − bb' = (a − b)a' + b(a' − b'),

and since m divides both (a − b) and (a' − b'), it follows that m divides
aa' − bb'. This completes the proof. ∎

This result says, for example, that 7 ≡ 53 (mod 23) and 18 ≡ −5 (mod 23)
together imply 7 + 18 ≡ 53 − 5 (mod 23) and (7)(18) ≡ (53)(−5) (mod 23).
Naturally, one may also check directly that 25 ≡ 48 (mod 23) and 126 ≡
−265 (mod 23).
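
The two descriptions of congruence (divisibility of the difference, and equal remainders) and the rules of 1-7-4 are easy to test numerically. The Python sketch below is our own illustration; `is_congruent` and `same_remainder` are hypothetical helper names. It relies on the fact that Python's `%` operator returns a remainder in the range 0, ..., m − 1 when m > 0, even for negative integers.

```python
def is_congruent(a, b, m):
    """True when m divides a - b, that is, a ≡ b (mod m)."""
    return (a - b) % m == 0


def same_remainder(a, b, m):
    """True when a and b leave the same remainder on division by m > 0."""
    return a % m == b % m


m = 23
# The numerical example following 1-7-4.
print(is_congruent(7, 53, m), is_congruent(18, -5, m))
print(is_congruent(7 + 18, 53 - 5, m), is_congruent(7 * 18, 53 * (-5), m))

# Spot check of 1-7-3: both criteria agree on a small window of integers.
window = range(-30, 31)
print(all(is_congruent(a, b, m) == same_remainder(a, b, m)
          for a in window for b in window))
```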
Since we now have the “right” to add and multiply two congruences
mod m, the process may be repeated for more than two congruences, and the
following statements are clearly valid.


1-7-5. Corollary. If we have several congruences

a_i ≡ b_i (mod m), i = 1, ..., r,

then

a_1 + a_2 + ··· + a_r ≡ b_1 + b_2 + ··· + b_r (mod m)

and

a_1 a_2 ··· a_r ≡ b_1 b_2 ··· b_r (mod m).

In particular, if a = a_1 = a_2 = ··· = a_r and b = b_1 = b_2 = ··· = b_r, these
assertions take the following form:

If a ≡ b (mod m), then ra ≡ rb (mod m) and a^r ≡ b^r (mod m) for every
positive integer r.


1-7-6. Example. Congruences can sometimes be used as labor-saving
devices in certain types of computations. As an example, consider the following
question: What is the remainder when 2^50 is divided by 7? According to
theory, we can write

2^50 = q·7 + r, 0 ≤ r < 7,

but it would be foolish to compute 2^50 and then divide by 7. Instead, we
observe from this equation that the remainder r satisfies

2^50 ≡ r (mod 7)

so our problem is really to find the unique integer r with 0 ≤ r < 7 which is
congruent to 2^50 modulo 7. For this, we may start with

2^1 ≡ 2 (mod 7), 2^2 ≡ 4 (mod 7), 2^3 ≡ 1 (mod 7).

Raising the last congruence to the 16th power gives 2^48 ≡ 1 (mod 7); then
multiplying this by 2^2, we arrive at

2^50 ≡ 4 (mod 7),

so the remainder is 4.

Let us consider another example in the same vein: Is 2^30 · 14^50 − 1
divisible by 11? We treat the powers of 2 and 14 separately. Starting with
2^2 ≡ 4 (mod 11) and 2^3 ≡ 8 ≡ −3 (mod 11), we see that

2^5 = 2^2 · 2^3 ≡ (4)(−3) = −12 ≡ −1 (mod 11).

Consequently, raising this congruence to the 6th power,

2^30 = (2^5)^6 ≡ (−1)^6 = 1 (mod 11).

Furthermore, 14 ≡ 3 (mod 11) so 14^50 ≡ 3^50 (mod 11). Now 3^2 ≡ −2 (mod 11)
so 3^4 ≡ 4 (mod 11), and by multiplying these two 3^6 ≡ −8 ≡ 3 (mod 11).
Therefore,

3^10 = 3^6 · 3^4 ≡ 3 · 4 ≡ 1 (mod 11)

and then

14^50 ≡ 3^50 = (3^10)^5 ≡ 1 (mod 11).

We conclude that 2^30 · 14^50 − 1 ≡ (1)(1) − 1 = 0 (mod 11), which says that
the given number is divisible by 11.

Needless to say, other arrangements of the foregoing computations are 
possible; they are all equally valid, so long as the amount of work is kept under 
control. 
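
Both computations of 1-7-6 can be confirmed in Python. The sketch below is ours, not the text's; it uses the built-in three-argument `pow(base, exp, mod)`, which reduces modulo `mod` as it goes, and compares the result with the brute-force computation that the text advises against doing by hand.

```python
# Remainder of 2^50 on division by 7.
r = pow(2, 50, 7)
print(r, r == 2 ** 50 % 7)

# Is 2^30 * 14^50 - 1 divisible by 11?  Work with the factors mod 11 separately.
p2 = pow(2, 30, 11)      # the text finds 2^30 ≡ 1 (mod 11)
p14 = pow(14, 50, 11)    # the text finds 14^50 ≡ 1 (mod 11)
divisible = (p2 * p14 - 1) % 11 == 0
print(p2, p14, divisible)
```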

1-7-7. Application. A much more interesting application of congruences
than the preceding one involves finding a criterion for divisibility by 11. For this,
it is necessary to recall the underlying meaning of the standard notation for
integers. If we write, for example, the number 728, this is simply a compact
notation for

7(10)^2 + 2(10) + 8 = 8 + 2(10) + 7(10)^2

and by the same token 31,059 is “shorthand” for

9 + 5(10) + 0(10)^2 + 1(10)^3 + 3(10)^4 = 3(10)^4 + 1(10)^3 + 0(10)^2 + 5(10) + 9.

In general, we write

a = a_n a_{n−1} ··· a_2 a_1 a_0,

where 0 ≤ a_i ≤ 9 for i = 0, 1, ..., n and a_n ≠ 0, to signify that

a = a_n(10)^n + a_{n−1}(10)^{n−1} + ··· + a_2(10)^2 + a_1(10) + a_0.

[It is often convenient to write this in ascending powers of 10; for example, if
a = 306,394,230,450 it is easier to start with a = 0 + 5(10) + ··· and work
one’s way up, than to start at the front with 3 and locate the power of 10 that
must be attached to 3.] This is simply another version of the “well-known
fact” (more about this in Section 1-8) that we write integers using only the
digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and with the front digit not equal to 0.

Turning to the powers of 10, we note that 10 ≡ −1 (mod 11); squaring
this congruence gives (10)^2 ≡ 1 (mod 11), while cubing it gives (10)^3 ≡
−1 (mod 11). The general rule is obviously

(10)^odd ≡ −1 (mod 11), (10)^even ≡ 1 (mod 11).

According to the rule for multiplication of congruences,

a_1(10) ≡ −a_1 (mod 11), a_2(10)^2 ≡ a_2 (mod 11), a_3(10)^3 ≡ −a_3 (mod 11),

and so on. Now by the rule for adding several congruences, 1-7-5, we see that

a = a_0 + a_1(10) + a_2(10)^2 + ··· + a_n(10)^n
  ≡ a_0 − a_1 + a_2 − ··· + (−1)^n a_n (mod 11).

This says that a and the alternating sum of the digits,

a_0 − a_1 + a_2 − a_3 + ··· + (−1)^n a_n = Σ_{i=0}^{n} (−1)^i a_i,

are congruent to each other mod 11; so, in particular, by 1-7-3, they have the
same remainder upon division by 11. Consequently, we have shown

11 | a ⇔ 11 | Σ_{i=0}^{n} (−1)^i a_i,

or in words, an integer a is divisible by 11 ⇔ the alternating sum of its digits
is divisible by 11.

As an illustration, consider a = 31,059; since the alternating sum of the 
digits is 9 — 5 + 0 — 1+3 = 6, we conclude that 31,059 is not divisible by 
11. In similar fashion, the reader may verify that 11 divides 306,394,230,450. 
With regard to the alternating sum, it should be noted that 

11 | (a_0 − a_1 + ··· + (−1)^n a_n) ⇔ 11 | −(a_0 − a_1 + ··· + (−1)^n a_n).

Since exactly one of (a_0 − a_1 + ··· + (−1)^n a_n) and −(a_0 − a_1 + ··· + (−1)^n a_n)
contains the term +a_n, it follows that in taking the alternating sum there is no
harm (as far as the question of divisibility by 11 is concerned) in starting from
+a_n and then alternating signs as one moves through the digits of a from left
to right. For example, if a = 306,394,230,450 we take the alternating sum 
3 — 0 + 6 — 3 + 9 — 4 + 2 — 3+0 — 4 + 5 — 0; since this equals 11, a is 
divisible by 11. 
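
The divisibility test of 1-7-7 translates directly into a few lines of Python. This is an illustrative sketch with names of our own choosing (`alternating_digit_sum`, `divisible_by_11`); it forms the sum a_0 − a_1 + a_2 − ··· starting from the units digit, for a non-negative integer a.

```python
def alternating_digit_sum(a):
    """Return a_0 - a_1 + a_2 - ... for the decimal digits of a >= 0.

    a_0 is the units digit, a_1 the tens digit, and so on.
    """
    total, sign = 0, 1
    while a > 0:
        a, digit = divmod(a, 10)
        total += sign * digit
        sign = -sign
    return total


def divisible_by_11(a):
    """Apply the criterion: 11 | a exactly when 11 divides the alternating sum."""
    return alternating_digit_sum(a) % 11 == 0


for a in (31059, 306394230450):
    print(a, alternating_digit_sum(a), divisible_by_11(a))
```

Starting from the units digit, the second number gives −11 rather than the 11 obtained in the text by starting from the leading digit; as noted above, the sign is immaterial for divisibility by 11.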

1-7-8. Definition. Given any modulus m, let us introduce the following
notation; for any a ∈ Z we write

[a]_m = {b ∈ Z | b ≡ a (mod m)}.

In words, [a]_m denotes the set of all integers which are congruent to a (mod m);
so it is appropriate to call [a]_m the congruence class of a (mod m).

Since two integers are congruent modulo m ⇔ they have the same
remainder upon division by m, we may view [a]_m in another way: namely,
[a]_m is the set of all integers which have the same remainder as a upon division
by m. This point of view underlies the fact that [a]_m is also referred to as the
residue class of a (mod m).

What does a residue class (mod m) look like in a concrete situation?
Suppose, for example, m = 7, and consider [3]_7. We want to decide which
integers belong to the set [3]_7. According to the definition, b ∈ [3]_7 ⇔ b ≡
3 (mod 7), so the way to decide if b ∈ [3]_7 is to check if b ≡ 3 (mod 7). Thus,
3 ∈ [3]_7 because 3 ≡ 3 (mod 7), 10 ∈ [3]_7 because 10 ≡ 3 (mod 7), 14 ∉ [3]_7
because 14 ≢ 3 (mod 7), −4 ∈ [3]_7 because −4 ≡ 3 (mod 7), −11 ∈ [3]_7
because −11 ≡ 3 (mod 7), and so on. By now, it is more or less clear that

[3]_7 = {..., −18, −11, −4, 3, 10, 17, 24, ...}.

In the same way, the reader may convince himself that

[12]_7 = {..., −16, −9, −2, 5, 12, 19, 26, 33, ...}.

These examples seem to say that to describe [a]_7 one starts with a and keeps
adding or subtracting 7. Indeed, we have the following general result which
settles the question.

1-7-9. Proposition. For any a ∈ Z, the residue class [a]_m consists of all
integers that arise by adding a multiple of m to a; in symbols,

[a]_m = {a + tm | t ∈ Z}.

Proof: Making use of the definitions of residue class, congruence and
divisibility, we have

b ∈ [a]_m ⇔ b ≡ a (mod m)
          ⇔ m | (b − a)
          ⇔ b − a = tm, for some t ∈ Z
          ⇔ b = a + tm, for some t ∈ Z
          ⇔ b ∈ {a + tm | t ∈ Z}.

This does it. ∎

In virtue of this result we can write down residue classes (mod 7) at will;
for example,

[5]_7 = {..., −23, −16, −9, −2, 5, 12, 19, 26, 33, ...},

[−1]_7 = {..., −22, −15, −8, −1, 6, 13, 20, 27, 34, ...},

[6]_7 = {..., −22, −15, −8, −1, 6, 13, 20, 27, 34, ...},

[0]_7 = {..., −28, −21, −14, −7, 0, 7, 14, 21, 28, ...},

[1]_7 = {..., −27, −20, −13, −6, 1, 8, 15, 22, 29, ...},

[7]_7 = {..., −28, −21, −14, −7, 0, 7, 14, 21, 28, ...},

[15]_7 = {..., −20, −13, −6, 1, 8, 15, 22, 29, 36, 43, ...}.
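
Proposition 1-7-9 gives a direct recipe for listing part of a residue class: start at a member and step by m. The Python sketch below is our own illustration (the name `residue_class_window` is hypothetical); since a residue class is infinite, it lists only the members lying between two given bounds.

```python
def residue_class_window(a, m, lo, hi):
    """Members of [a]_m lying in lo..hi inclusive, in increasing order."""
    first = lo + (a - lo) % m        # smallest member of [a]_m that is >= lo
    return list(range(first, hi + 1, m))


print(residue_class_window(3, 7, -18, 24))
print(residue_class_window(12, 7, -16, 33))

# [-1]_7 and [6]_7 are the same class, so any window of them agrees.
print(residue_class_window(-1, 7, -22, 34) == residue_class_window(6, 7, -22, 34))
```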

These numerical examples suggest various properties of congruence 
classes; some of the significant ones are given in the next result. 

1-7-10. Proposition. The residue classes mod m satisfy the following
properties:

(i) a ∈ [a]_m for all a ∈ Z,

(ii) b ∈ [a]_m ⇔ a ∈ [b]_m,

(iii) b ∈ [a]_m ⇒ [b]_m = [a]_m,

(iv) [a]_m ∩ [b]_m ≠ ∅ ⇒ [a]_m = [b]_m.


Proof: Our proof will be based on the interpretation of [a]_m as the set of
all integers which have the same remainder as a upon division by m; the
operative form of this is

b ∈ [a]_m ⇔ b has the same remainder as a.

Now, (i) is trivial because a has the same remainder as a. In words, (i) may
be stated as: Any integer belongs to the residue class it determines.

To prove (ii), one simply observes that b has the same remainder as a ⇔ a
has the same remainder as b. In words, (ii) says that b belongs to the residue
class of a ⇔ a belongs to the residue class of b.

As for (iii), if b has the same remainder as a, then any number that has the
same remainder as b has the same remainder as a (which implies [b]_m ⊂ [a]_m)
and conversely (which implies [a]_m ⊂ [b]_m); hence, [a]_m = [b]_m. Thus (iii)
tells us that any element (that is, b) of a residue class (that is, [a]_m) determines
the residue class.

Finally, (iv) says that if two residue classes have an element in common
(other ways to express this are: have nonempty intersection, meet, intersect,
are not disjoint) they are identical. From this we conclude that two residue
classes are disjoint or else they are identical.

To prove (iv), suppose [a]_m ∩ [b]_m ≠ ∅, so there exists an integer c in
the intersection; then

c ∈ [a]_m and c ∈ [b]_m
    ⇒ [c]_m = [a]_m and [c]_m = [b]_m    (by iii)
    ⇒ [a]_m = [b]_m.

This completes the proof. ∎
1-7-11. Corollary. For a, b ∈ Z the following are equivalent:

(1) a ≡ b (mod m),

(2) a and b have the same remainder upon division by m,

(3) a ∈ [b]_m,

(4) b ∈ [a]_m,

(5) [a]_m = [b]_m.

Proof: We already know that (1) ⇔ (2), (2) ⇔ (3), (3) ⇔ (4), (4) ⇔ (5),
so there is nothing to prove. ∎

It is sometimes convenient to make use of the contrapositive form of the
equivalence of (5), (1), and (2); namely, for a, b ∈ Z, [a]_m ≠ [b]_m ⇔ a ≢
b (mod m) ⇔ a and b have different remainders upon division by m. Of
course, these equivalences do not require proof.

Let us illustrate some of the consequences of these results. Suppose we
take m = 7 and consider the residue classes

[0]_7, [1]_7, [2]_7, [3]_7, [4]_7, [5]_7, [6]_7.

No two of these residue classes are the same because no two of the integers
0, 1, 2, 3, 4, 5, 6 have the same remainder upon division by 7; so these seven
residue classes are distinct. Now, consider any integer a. By the division
algorithm, a = q·7 + r where 0 ≤ r < 7, and therefore [a]_7 = [r]_7 which is one
of the 7 residue classes already listed. We conclude that there are exactly 7
residue classes modulo 7, and any integer belongs to one (in fact, because
residue classes are distinct, to exactly one) residue class. Note that because a
residue class has many “names” (for example, [0]_7 = [7]_7 = [14]_7 = [21]_7)
the 7 residue classes can be denoted in various ways; for example,

mi,, llil,. b£],. 1U,. bill,. HU,. [22J, 

is a perfectly valid way to list the 7 residue classes. 

It is clear that the same arguments and results apply for an arbitrary 
modulus m , and we may summarize as follows: 

1-7-12. Theorem. For any integer m > 0, Z decomposes (that is, breaks
up) into exactly m disjoint residue classes (mod m), namely,

[0]_m, [1]_m, [2]_m, ..., [m − 1]_m.

Every integer belongs to exactly one residue class, and every residue class
contains an infinite number of integers.

Proof: By now this is obvious, so the details may be left to the reader. ∎
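
The decomposition in 1-7-12 can be observed on any finite stretch of integers. The following Python sketch is ours (the name `classes_mod` is hypothetical); it sorts the integers lo, ..., hi into buckets according to their remainder mod m, which by 1-7-3 is the same as sorting them by residue class.

```python
def classes_mod(m, lo, hi):
    """Group the integers lo..hi by remainder: {r: members of [r]_m in lo..hi}."""
    classes = {r: [] for r in range(m)}
    for n in range(lo, hi + 1):
        classes[n % m].append(n)     # n % m lies in 0..m-1 for m > 0
    return classes


classes = classes_mod(7, -28, 34)
for r, members in classes.items():
    print(r, members)
```

Every integer in the window lands in exactly one of the m buckets, which is the finite shadow of the statement that Z breaks up into m disjoint residue classes.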

1-7-13. Remark. The key ingredient in arriving at the decomposition of
Z into disjoint residue classes is 1-7-10, in which four basic properties of
residue classes (mod m) are listed. These properties were proved by viewing
residue classes in terms of remainders upon division by m. It is highly
instructive to arrive at the decomposition of Z into disjoint congruence classes
by proving 1-7-10 in another way; specifically, by using only the basic
properties (namely the reflexive, symmetric, and transitive properties) of
congruence (mod m). The importance of this approach lies in the fact that it
is a special case of a very general phenomenon that will be discussed in Chapter
III.

We start from the definition of [a]_m in the form

b ∈ [a]_m ⇔ b ≡ a (mod m).

The reflexive property says that a ≡ a (mod m), so a ∈ [a]_m and part (i) of
1-7-10 holds. Using the symmetric property we have: b ∈ [a]_m ⇔ b ≡
a (mod m) ⇔ a ≡ b (mod m) ⇔ a ∈ [b]_m, which proves (ii). As for (iii),
b ∈ [a]_m implies b ≡ a (mod m), and using the transitive property, we see that
for c ∈ Z

c ∈ [a]_m ⇔ c ≡ a (mod m) ⇔ c ≡ b (mod m) ⇔ c ∈ [b]_m

which shows that [a]_m = [b]_m. The proof of (iv) given earlier depends only on
(iii), so there is no need to modify it.

1-7-14. Discussion. For each m > 0, we let Z_m denote the set whose
elements are the congruence classes (mod m). Thus, Z_m is a set with m elements,
and we use the symbols A, B, C, ... to represent elements of Z_m. It would be
nice to be able to combine elements of Z_m; more precisely, for any two
congruence classes A and B to be able to “add” and “multiply” them.
Naturally, the results of addition and multiplication would be denoted by
A + B and A · B, respectively.

How might one go about this? One possible method is as follows. Since
A and B are congruence classes we may write A = [a]_m, B = [b]_m, where
a, b ∈ Z. We know how to add or multiply a and b, so it seems natural to
take A + B and A · B as the congruence classes [a + b]_m and [ab]_m,
respectively. In other words, we make the definitions

[a]_m + [b]_m = [a + b]_m,

[a]_m · [b]_m = [ab]_m.
For example, if A = [5]_7 and B = [6]_7, then, according to the definition,
A + B = [11]_7 = [4]_7 and A · B = [30]_7 = [2]_7.

There is, however, a possible flaw in these definitions; they depend on the
choice of the integers a ∈ A and b ∈ B. What if, instead of writing A = [a]_m,
B = [b]_m, we write A = [a']_m, B = [b']_m where a', b' ∈ Z? Do we obtain the
same answers for A + B and A · B when a' and b' (instead of a and b) are used
as the “representatives” of A and B, respectively? In other words, when a'
and b' are used, the definitions say that

[a']_m + [b']_m = [a' + b']_m,    [a']_m · [b']_m = [a'b']_m.

Do these definitions give the same answers as before? The critical question is,
therefore, to decide if

[a' + b']_m = [a + b]_m and [a'b']_m = [ab]_m.

Let us illustrate what is troubling us with a numerical example. Suppose
A = [5]_7, B = [6]_7, so A + B = [5 + 6]_7 and A · B = [5·6]_7. Now, A and B
are sets

A = {..., −16, −9, −2, 5, 12, 19, 26, ...},

B = {..., −15, −8, −1, 6, 13, 20, 27, 34, ...},

and can be expressed in the [ ]_7 notation in many ways. In fact, for any
x ∈ A, y ∈ B we have A = [x]_7, B = [y]_7. For concreteness, suppose we
write A = [−9]_7, B = [27]_7; then A + B = [−9 + 27]_7 and A · B =
[(−9)(27)]_7. Are these the same congruence classes A + B and A · B as before?
So we need to know if

[5 + 6]_7 = [−9 + 27]_7 and [5·6]_7 = [(−9)(27)]_7.

Of course, these equalities do hold here: the sums both equal [4]_7, and the
products both equal [2]_7. But all this involves specific choices. What if the
choices are arbitrary? The reader should try to convince himself that for any
x ∈ A, y ∈ B we still obtain [x + y]_7 = [4]_7 and [xy]_7 = [2]_7.

Returning to the general situation, we note that the mathematical language
commonly used to express the question facing us is to ask: are the definitions of
addition and multiplication of congruence classes well defined? The term
“well defined” refers to the choices that enter into the definitions, namely,
a ∈ A and b ∈ B; we need to guarantee that for all possible choices of a and
b the definitions give the same result. In our case, this boils down to showing:
if [a']_m = [a]_m and [b']_m = [b]_m, then [a' + b']_m = [a + b]_m and
[a'b']_m = [ab]_m.

Once the question has been formulated, it is easy to provide the answer.
If [a']_m = [a]_m and [b']_m = [b]_m, then, by 1-7-11, a' ≡ a (mod m) and b' ≡
b (mod m). According to 1-7-4, congruences may be added or multiplied; so
a' + b' ≡ a + b (mod m) and a'b' ≡ ab (mod m), which says that
[a' + b']_m = [a + b]_m and [a'b']_m = [ab]_m.

We have shown that the rather natural definitions of addition and multi¬ 
plication of congruence classes (mod m) are indeed well defined. 
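
The independence from the choice of representatives can also be spot-checked by machine. In the Python sketch below (our own illustration; `reps` and `operations_well_defined` are hypothetical names) a class is identified by its remainder, several representatives a + tm are tried for each class, and the class of the sum and of the product is compared with the one obtained from the standard representatives.

```python
from itertools import product


def reps(a, m, k=3):
    """A few representatives a + t*m of the class [a]_m, for t = -k..k."""
    return [a + t * m for t in range(-k, k + 1)]


def operations_well_defined(m, k=3):
    """Check that [a2 + b2]_m and [a2 * b2]_m do not depend on the
    representatives a2 of [a]_m and b2 of [b]_m (within the sampled range)."""
    for a, b in product(range(m), repeat=2):
        for a2, b2 in product(reps(a, m, k), reps(b, m, k)):
            if (a2 + b2) % m != (a + b) % m or (a2 * b2) % m != (a * b) % m:
                return False
    return True


print(operations_well_defined(7))
# The numerical example from the text: A = [5]_7 = [-9]_7, B = [6]_7 = [27]_7.
print((-9 + 27) % 7, (5 + 6) % 7, (-9 * 27) % 7, (5 * 6) % 7)
```

Such a check covers only finitely many representatives; the general statement is exactly what the argument from 1-7-4 and 1-7-11 establishes.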

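1-7-15. Example. Consider the set

$$Z_7 = \{\lfloor 0 \rfloor_7, \lfloor 1 \rfloor_7, \lfloor 2 \rfloor_7, \lfloor 3 \rfloor_7, \lfloor 4 \rfloor_7, \lfloor 5 \rfloor_7, \lfloor 6 \rfloor_7\}.$$

It is surely preferable, especially psychologically, to write the elements of $Z_7$
in this way rather than in an outlandish form like

$$Z_7 = \{\lfloor 91 \rfloor_7, \lfloor -6 \rfloor_7, \lfloor 37 \rfloor_7, \lfloor -25 \rfloor_7, \lfloor -31 \rfloor_7, \lfloor 54 \rfloor_7, \lfloor 6 \rfloor_7\}.$$

We know how to add and multiply elements of $Z_7$; for example,

$$\lfloor 3 \rfloor_7 + \lfloor 5 \rfloor_7 = \lfloor 1 \rfloor_7, \quad \lfloor 2 \rfloor_7 + \lfloor 5 \rfloor_7 = \lfloor 0 \rfloor_7, \quad \lfloor 2 \rfloor_7 + \lfloor 3 \rfloor_7 = \lfloor 5 \rfloor_7,$$

$$\lfloor 2 \rfloor_7 \cdot \lfloor 3 \rfloor_7 = \lfloor 6 \rfloor_7, \quad \lfloor 3 \rfloor_7 \cdot \lfloor 4 \rfloor_7 = \lfloor 5 \rfloor_7, \quad \lfloor 2 \rfloor_7 \cdot \lfloor 6 \rfloor_7 = \lfloor 5 \rfloor_7.$$

More generally, we can make tables for addition and multiplication in $Z_7$.
For this it is convenient to drop the cumbersome $\lfloor \ \rfloor_7$ notation and write

$$Z_7 = \{0, 1, 2, 3, 4, 5, 6\},$$

with the understanding that each of these integers represents its congruence
class. Thus, the examples above take the form

    3 + 5 = 1,    2 + 5 = 0,    2 + 3 = 5,

    2 · 3 = 6,    3 · 4 = 5,    2 · 6 = 5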

and the desired tables are as follows:

    addition in Z_7                  multiplication in Z_7

     + | 0 1 2 3 4 5 6                · | 0 1 2 3 4 5 6
    ---+--------------               ---+--------------
     0 | 0 1 2 3 4 5 6                0 | 0 0 0 0 0 0 0
     1 | 1 2 3 4 5 6 0                1 | 0 1 2 3 4 5 6
     2 | 2 3 4 5 6 0 1                2 | 0 2 4 6 1 3 5
     3 | 3 4 5 6 0 1 2                3 | 0 3 6 2 5 1 4
     4 | 4 5 6 0 1 2 3                4 | 0 4 1 5 2 6 3
     5 | 5 6 0 1 2 3 4                5 | 0 5 3 1 6 4 2
     6 | 6 0 1 2 3 4 5                6 | 0 6 5 4 3 2 1

Another way to represent $Z_7$ is:

$$Z_7 = \{\lfloor -3 \rfloor_7, \lfloor -2 \rfloor_7, \lfloor -1 \rfloor_7, \lfloor 0 \rfloor_7, \lfloor 1 \rfloor_7, \lfloor 2 \rfloor_7, \lfloor 3 \rfloor_7\}$$

$$= \{-3, -2, -1, 0, 1, 2, 3\}$$

and with this notation the addition and multiplication tables for $Z_7$ take the
form

    addition in Z_7

      + |  0  1  2  3 -3 -2 -1
    ----+---------------------
      0 |  0  1  2  3 -3 -2 -1
      1 |  1  2  3 -3 -2 -1  0
      2 |  2  3 -3 -2 -1  0  1
      3 |  3 -3 -2 -1  0  1  2
     -3 | -3 -2 -1  0  1  2  3
     -2 | -2 -1  0  1  2  3 -3
     -1 | -1  0  1  2  3 -3 -2

    multiplication in Z_7

      · |  0  1  2  3 -3 -2 -1
    ----+---------------------
      0 |  0  0  0  0  0  0  0
      1 |  0  1  2  3 -3 -2 -1
      2 |  0  2 -3 -1  1  3 -2
      3 |  0  3 -1  2 -2  1 -3
     -3 |  0 -3  1 -2  2 -1  3
     -2 |  0 -2  3  1 -1 -3  2
     -1 |  0 -1 -2 -3  3  2  1

1-7-16 / PROBLEMS 

1. Consider the integers 3, 19, 87, −15, −71, 96, 240, −113, 69, 378, −91,
−14, 500, −312, 153; which of them are congruent to each other (mod m)
when

(i) m = 7, (ii) m = 11, (iii) m = 13.

2. Show that

(i) If $a \equiv b \pmod{m}$ and $n \mid m$, $n > 0$, then $a \equiv b \pmod{n}$.

(ii) Suppose that $m_1, m_2, \ldots, m_r$ are all positive; then $a \equiv b \pmod{m_i}$
for $i = 1, 2, \ldots, r$ $\iff$ $a \equiv b \pmod{[m_1, m_2, \ldots, m_r]}$.

3. How would you express $a \equiv 0 \pmod{m}$ in another way?

4. What is the remainder when $b = 2^{50} - 1$ is divided by $a = 31 = 2^5 - 1$?
What if $b = 2^{40} + 17$?

5. Find and prove a criterion for deciding if an integer is divisible by 9. Do
the same thing for 3. Is 748,052,301,472 divisible by 9? by 3? by 11?

6. How can one decide easily if an integer is divisible by 2? by 4? by 8?
by 10? by 5? by 25?

7. Show that for every $n > 0$:

(i) $10^n + 3 \cdot 4^{n+2}$ leaves a remainder of 4 upon division by 9,

(ii) 24 divides $2 \cdot 7^n + 3 \cdot 5^n - 5$,

(iii) $3^{4n+2} + 5^{2n+1}$ is divisible by 14.

8. (i) Find the remainder when $a^7$ is divided by 7 for each a satisfying

0 < a < 7.

(ii) For each integer a satisfying 0 < a < 11 find the integer less than 11
which is congruent to $a^{11}$ (mod 11).

(iii) Do the same thing with 11 replaced by 12.

9. Prove: If p is prime and (a, p) = 1, then

(i) $a^2 \equiv 1 \pmod{p} \implies a \equiv \pm 1 \pmod{p}$,

(ii) $ab \equiv ac \pmod{p} \implies b \equiv c \pmod{p}$.

10. Show that if $a \equiv b \pmod{m}$, then (a, m) = (b, m).

11. Choose three explicit elements $a_1, a_2, a_3$ in $\lfloor 3 \rfloor_7$ and three explicit
elements $b_1, b_2, b_3$ in $\lfloor 9 \rfloor_7$. Consider the nine numbers $a_i + b_j$, $i, j = 1, 2, 3$,
and verify that they all belong to the same congruence class, namely
$\lfloor 5 \rfloor_7$. Furthermore, show that the nine products $a_i b_j$, $i, j = 1, 2, 3$, all
belong to $\lfloor 6 \rfloor_7$.

12. Which of the following congruence classes are equal?— 

\ 2 ± J». a,. li£J.. [^J 5 . LyzJ,. b^J,. I — l4 U - 

l~ M L ' 1 184 U- LLLl,. b2ZJ»> b£j.- liZU.* liZU.- l£J>- 
l^,. liU=. U21.. 12ZJ». bl5ls. biil.- H2J,- 
bJ 6 - 

13. Make addition and multiplication tables for:

(i) $Z_2 = \{0, 1\}$, (ii) $Z_6 = \{0, 1, 2, 3, 4, 5\}$,

(iii) $Z_5 = \{-2, -1, 0, 1, 2\}$.

14. Suppose $f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$, where $c_0, c_1, \ldots, c_n$ are
integers; prove that if $a \equiv b \pmod{m}$, then $f(a) \equiv f(b) \pmod{m}$.

15. How are the residue classes in $Z_6$ related to the residue classes in $Z_{12}$?

16. Show that if $ab \equiv ac \pmod{m}$, then $b \equiv c \pmod{m/(a, m)}$.

17. Discuss the solutions of each of the following:

(i) $4x \equiv 2 \pmod{6}$, (ii) $4x \equiv 1 \pmod{6}$,

(iii) $4x \equiv 4 \pmod{6}$, (iv) $x^2 \equiv 1 \pmod{15}$,

(v) $3x \equiv 800 \pmod{11}$.

18. Does there exist a positive integer n for which

(i) $2^n \equiv 1 \pmod{13}$, (ii) $3^n \equiv 1 \pmod{13}$,

(iii) $2^n \equiv 1 \pmod{17}$, (iv) $3^n \equiv 1 \pmod{17}$?

In each case, find the smallest n that will do.

19. Show by example that $a^n \equiv b^n \pmod{m}$ need not imply $a \equiv b \pmod{m}$.

20. A person makes a 48¢ purchase in a store. The purchaser has a one dollar
bill and three pennies. The storekeeper has six dimes and seven nickels.
How should the transaction be arranged? In how many ways can it be
done? What if the purchase is for 49¢?


1-8. Radix Representation 


The major results of this chapter—for example, the properties of greatest 
common divisor, uniqueness of factorization, solving linear diophantine 
equations—all depend, in part, on the division algorithm. In this section, we 
discuss one more extremely important consequence of the division algorithm. 

Suppose we fix an integer $a > 1$. Consider any $b > 0$. According to the
division algorithm we can write, uniquely, $b = qa + r$ where $0 \le r < a$. In the
Euclidean algorithm, we then apply the division algorithm to a and r and
repeat the process until we arrive at a remainder of 0. Here, instead, we apply
the division algorithm to q and a and repeat the process until we arrive at a
quotient q which is 0. More precisely, rewriting the equation we already have
as $b = q_0 a + r_0$, and then indexing in the obvious fashion, we have

$$\begin{aligned}
b &= q_0 a + r_0, & 0 \le r_0 < a, \\
q_0 &= q_1 a + r_1, & 0 \le r_1 < a, \\
q_1 &= q_2 a + r_2, & 0 \le r_2 < a, \\
&\;\;\vdots \\
q_{n-2} &= q_{n-1} a + r_{n-1}, & 0 \le r_{n-1} < a, \\
q_{n-1} &= 0 \cdot a + r_n, & 0 \le r_n < a.
\end{aligned}$$

The index n is taken here as the smallest integer greater than or equal to 0 for
which $q_n = 0$. Of course, it is necessary to show that our process always gets
to a quotient $q_n$ which equals 0. This is quite easy. Because $a > 1$ and $b > 0$,
the first equation implies that $b > q_0 \ge 0$. If $q_0 = 0$, we are finished; so suppose
$q_0 > 0$. The second equation then exists, and it implies $q_0 > q_1 \ge 0$. Thus,
as our process unfolds, we obtain

$$b > q_0 > q_1 > q_2 > \cdots \ge 0$$

which is a strictly decreasing sequence of nonnegative integers. After a finite
number of steps such a sequence must reach 0; that is, eventually we obtain
$q_n = 0$.

Note that once we locate the n for which $q_n = 0$, our choice of notation
guarantees that $q_{n-1} > 0$ and $r_n = q_{n-1} \ne 0$.

What good is all this? Let us substitute for $q_0$ in the first equation; this
gives

$$\begin{aligned}
b &= q_0 a + r_0 \\
  &= (q_1 a + r_1)a + r_0 \\
  &= q_1 a^2 + r_1 a + r_0.
\end{aligned}$$

Substituting for $q_1$ in this expression, we obtain

$$\begin{aligned}
b &= (q_2 a + r_2)a^2 + r_1 a + r_0 \\
  &= q_2 a^3 + r_2 a^2 + r_1 a + r_0.
\end{aligned}$$

This procedure is repeated until, at the end, we have

$$\begin{aligned}
b &= q_{n-2} a^{n-1} + r_{n-2} a^{n-2} + \cdots + r_2 a^2 + r_1 a + r_0 \\
  &= (q_{n-1} a + r_{n-1}) a^{n-1} + r_{n-2} a^{n-2} + \cdots + r_1 a + r_0 \\
  &= q_{n-1} a^n + r_{n-1} a^{n-1} + r_{n-2} a^{n-2} + \cdots + r_1 a + r_0 \\
  &= r_n a^n + r_{n-1} a^{n-1} + \cdots + r_2 a^2 + r_1 a + r_0,
\end{aligned}$$

where $0 \le r_0, r_1, \ldots, r_n < a$ (meaning that $r_0, r_1, \ldots, r_n$ are all greater than
or equal to 0 and less than a) and $r_n \ne 0$.

This shows, in rough terms, that every positive integer b can be expanded 
in powers of a; and even more, we have a recipe for finding such an expansion 
for b . We illustrate with a numerical example. Immediately, thereafter, we 
shall give the precise formulation of our general result. 

1-8-1. Example. Consider a = 2, b = 489. The division algorithm
procedure gives

    489 = 244 · 2 + 1,
    244 = 122 · 2 + 0,
    122 =  61 · 2 + 0,
     61 =  30 · 2 + 1,
     30 =  15 · 2 + 0,
     15 =   7 · 2 + 1,
      7 =   3 · 2 + 1,
      3 =   1 · 2 + 1,
      1 =   0 · 2 + 1.

Although it is not essential, we list the values of the q’s and r’s. Starting from
b = 489, a = 2 and working from the top down we have: $q_0 = 244$, $r_0 = 1$,
$q_1 = 122$, $r_1 = 0$, $q_2 = 61$, $r_2 = 0$, $q_3 = 30$, $r_3 = 1$, $q_4 = 15$, $r_4 = 0$, $q_5 = 7$,
$r_5 = 1$, $q_6 = 3$, $r_6 = 1$, $q_7 = 1$, $r_7 = 1$, $q_8 = 0$, $r_8 = 1$. Moreover, the r’s provide
the coefficients in the expansion of 489 in terms of powers of 2; in fact, we
have immediately

$$489 = 1 \cdot 2^8 + 1 \cdot 2^7 + 1 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 0 \cdot 2^2 + 0 \cdot 2^1 + 1.$$

Note that running through the list of remainders in our process from top to
bottom (that is, from $r_0$ to $r_n$, where n = 8) determines the coefficients of the
powers of 2, going from right to left.

Similarly, for a = 7, b = 489 we have

    489 = 69 · 7 + 6,
     69 =  9 · 7 + 6,
      9 =  1 · 7 + 2,
      1 =  0 · 7 + 1,

so that

$$\begin{aligned}
489 &= 6 + 6 \cdot 7^1 + 2 \cdot 7^2 + 1 \cdot 7^3 \\
    &= 1 \cdot 7^3 + 2 \cdot 7^2 + 6 \cdot 7^1 + 6
\end{aligned}$$

and for a = 3, b = 543 we obtain

    543 = 181 · 3 + 0,
    181 =  60 · 3 + 1,
     60 =  20 · 3 + 0,
     20 =   6 · 3 + 2,
      6 =   2 · 3 + 0,
      2 =   0 · 3 + 2,

so that

$$\begin{aligned}
543 &= 0 + 1 \cdot 3^1 + 0 \cdot 3^2 + 2 \cdot 3^3 + 0 \cdot 3^4 + 2 \cdot 3^5 \\
    &= 2 \cdot 3^5 + 0 \cdot 3^4 + 2 \cdot 3^3 + 0 \cdot 3^2 + 1 \cdot 3^1 + 0.
\end{aligned}$$
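The recipe used in these three computations (divide by a, record the remainder, continue with the quotient until it is 0) can be sketched in Python as follows. The function name `to_base` and the choice to return the digits as a list, most significant first, are illustrative assumptions.

```python
def to_base(b, a):
    """Digits r_n, ..., r_1, r_0 of the positive integer b in base a > 1."""
    if a <= 1 or b <= 0:
        raise ValueError("need a > 1 and b > 0")
    remainders = []
    q = b
    while q > 0:
        q, r = divmod(q, a)   # q_k = q_{k+1} * a + r_{k+1}
        remainders.append(r)  # collected in the order r_0, r_1, ..., r_n
    return remainders[::-1]   # reverse so the leading digit r_n comes first


print(to_base(489, 2))
print(to_base(489, 7))
print(to_base(543, 3))
```

The loop terminates for the reason given in the text: the quotients form a strictly decreasing sequence of nonnegative integers.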

1-8-2. Theorem. If $a > 1$ is fixed, then any $b > 0$ can be expressed uniquely
in the form

$$b = r_n a^n + r_{n-1} a^{n-1} + \cdots + r_2 a^2 + r_1 a + r_0,$$

where

$$0 \le r_0, r_1, \ldots, r_n < a \quad \text{and} \quad r_n \ne 0.$$

Proof: It is understood that n is not fixed; it depends on the choice of b.
Of course, $n \ge 0$. The case n = 0 occurs when $b < a$, as then $b = 0 \cdot a + r_0$.
To keep the notation uniform one might prefer to write $r_0$ as $r_0 a^0$ in the
statement of the theorem, as then there is no need to point up the case n = 0.

The existence of an expression for b of the desired form was dealt with
earlier. The reader can easily supply the missing details, so we shall consider
this part proved.

It remains to prove the uniqueness of the expansion for every $b > 0$. For
this, suppose b is an integer with two expressions

$$b = r_n a^n + \cdots + r_1 a + r_0 = s_m a^m + \cdots + s_1 a + s_0,$$

where $0 \le r_0, r_1, \ldots, r_n < a$, $r_n \ne 0$, $0 \le s_0, s_1, \ldots, s_m < a$, $s_m \ne 0$. We must
show that $r_0 = s_0$, $r_1 = s_1$, ..., $r_n = s_n$ and m = n. As a first step, let us
rewrite the two expressions for b as

$$b = (r_n a^{n-1} + \cdots + r_1)a + r_0 = (s_m a^{m-1} + \cdots + s_1)a + s_0.$$

Now, both of these express the division algorithm for b and a; so, by
uniqueness of the division algorithm, we have $r_0 = s_0$ and

$$r_n a^{n-1} + \cdots + r_2 a + r_1 = s_m a^{m-1} + \cdots + s_2 a + s_1.$$

Applying the same procedure to these two expressions gives $r_1 = s_1$ and

$$r_n a^{n-2} + \cdots + r_2 = s_m a^{m-2} + \cdots + s_2.$$

This procedure may be repeated as many times as necessary. In particular, if
m = n, then we obtain, sequentially, $r_0 = s_0$, $r_1 = s_1$, ..., $r_m = s_m$, and the
proof of uniqueness is complete in this case. On the other hand, if $m \ne n$, we
may suppose, without loss of generality, that $m < n$. Then after m + 1 steps
of our process, we have $r_0 = s_0$, $r_1 = s_1$, ..., $r_m = s_m$, and

$$r_n a^{n-m} + \cdots + r_{m+2} a^2 + r_{m+1} a = 0.$$

Each term on the left side is greater than or equal to 0. Furthermore, $r_n > 0$
and $n - m > 0$ (because $n > m$), so the leading term $r_n a^{n-m}$ is greater than
0. Thus, the left side is greater than 0, a contradiction. Because the assumption
that $m \ne n$ leads to a contradiction, we conclude that m = n, and the proof is
complete. ∎
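Uniqueness can be illustrated (though of course not proved) by brute force for a small base. The Python sketch below, with hypothetical helper names, evaluates every admissible digit string of length at most 3 in base 3 and checks that no value occurs twice.

```python
from itertools import product


def from_digits(digits, a):
    """Evaluate r_n a^n + ... + r_1 a + r_0 for digits given leading-first."""
    value = 0
    for r in digits:
        if not 0 <= r < a:
            raise ValueError("each digit must satisfy 0 <= r < a")
        value = value * a + r  # Horner's rule: one multiplication per digit
    return value


def admissible_strings(a, max_len):
    """All digit strings of length 1..max_len whose leading digit is nonzero."""
    for length in range(1, max_len + 1):
        for lead in range(1, a):
            for rest in product(range(a), repeat=length - 1):
                yield (lead, *rest)


values = [from_digits(d, 3) for d in admissible_strings(3, 3)]
# 26 strings give 26 distinct values, namely every integer from 1 to 26.
print(sorted(values) == list(range(1, 27)))
```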

This theorem, which is concerned with what is known as the representation
of b in base a (or with radix a), is of immense importance. Among other things,
it underlies our standard notation for integers, and this notation determines
the rules according to which we compute with integers. This is not a trivial
matter. The Romans had a rather cumbersome notation for integers (nowadays,
we refer to it as “Roman numerals”) but it was completely unsatisfactory
for treating addition or multiplication; for example, in this system,
how does one compute XVIII + LIX or XVIII · LIX? By contrast, our
“modern” notation is so incisive that we can teach elementary school
children how to add, subtract, multiply, and divide integers, even though they
(and, perhaps, their teachers) may not understand why the techniques are
valid.

Historically, man learned how to count and deal with numbers long ago. 
Consider, for example, the following translation of a passage from Genesis: 

... And Methuselah lived seven and eighty years and one hundred years, and begot 
Lamech. And Methuselah lived after he begot Lamech two and eighty years and seven 
hundred years, and begot sons and daughters. And all the days of Methuselah were 
nine and sixty years and nine hundred years, and he died. And Lamech lived two and 
eighty years and one hundred years, and begot a son. And he called his name Noah, 
saying; this one shall comfort us in our work and in the toil of our hands, which 
comes from the earth which the Lord has cursed. And Lamech lived after he begot 
Noah five and ninety years and five hundred years, and begot sons and daughters. 
And all the days of Lamech were seven and seventy years and seven hundred years, 
and he died .... 

Thus, we see that even in “ ancient ” times it was known how to express 
numbers (that is, positive integers) in words, and also how to add.
Unfortunately, there was no notation available that facilitated computation with
integers. Presumably, computations had to be done by brute force. For
example, to add one hundred eighty two and five hundred ninety five one takes
a pile of one hundred eighty two stones and another pile of five hundred 
ninety five stones, combines them into a single pile and counts—ending up 
with seven hundred seventy seven. Actually, one does not have to do the 
counting in so primitive a fashion. As the biblical terminology indicates, 
numbers involve certain groupings, and these simplify the addition. Thus, the 
first collection of stones consists of: one group of a hundred, one group of 
eighty, and one group of two, while the second collection of stones consists of: 
five groups of a hundred, one group of ninety, and one group of five.
Combining these gives: six groups of a hundred, one group of eighty plus ninety, and
one group of seven. Now the group of eighty plus ninety can be organized into 
one group of a hundred and one group of seventy. So in the end there are 
seven groups of a hundred, one group of seventy, and one group of seven! 

What does multiplication involve when there is no satisfactory notation for
numbers and they are given solely in words? A single illustration should
suffice. Suppose we want to multiply twenty three by thirty seven; this means
we must count the number of objects in a collection consisting of thirty seven
groups with twenty three objects in each group or, equivalently, we need to
count the number of elements in a rectangular array with twenty three rows
and thirty seven columns. (Of course, the roles of rows and columns may be
interchanged.) This is an onerous task, especially for large numbers, but it can
be simplified by “cutting-up” the rectangle judiciously.

1-8-3. Notation. Having fussed a bit with numbers when a satisfactory
notation is lacking, we turn to a quick sketch of how and why our standard
notation for numbers works.

We start from numbers given only by words. Let us fix a = ten. The theorem
on radix representation says that any positive integer b can be written uniquely
in the form

$$b = r_n a^n + r_{n-1} a^{n-1} + \cdots + r_1 a + r_0 \qquad (*)$$

where $0 \le r_0, r_1, \ldots, r_n < a$ and $r_n \ne 0$. Of course, all the symbols here
represent words and integers. The sequence of r’s clearly determines b
completely, and each of the r’s is a number less than ten. We introduce, therefore,
the symbols “0, 1, 2, 3, 4, 5, 6, 7, 8, 9” for the numbers from zero through
nine, respectively. (Actually, zero, which indicates an “absence of number,”
requires special treatment, but we are careless and lump it in with the others.)
Thus, when we associate with b the sequence

$$r_n r_{n-1} \cdots r_1 r_0$$

as shorthand for (*), we are writing b as a sequence of symbols of type
0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and such that the first one (that is, the one on the left)
is not equal to 0. This sequence of symbols is normally referred to as the
number itself, whereas it is really a representation (expression, or expansion) of
the number b in the base a. Every number has a unique expression of this
form. The expansions for one, two, ..., nine are clear: namely, 1, 2, ..., 9,
respectively. What is the expansion for a = ten? Since

$$a = 1 \cdot a + 0,$$

we see that

    a = ten = 10.

What is the notation for one hundred? Since one hundred is the number of
objects in a ten by ten array, we have

$$\text{one hundred} = \text{ten} \cdot \text{ten} = 10 \cdot 10 = a \cdot a = a^2 = 1 \cdot a^2 + 0 \cdot a + 0$$

so that

    one hundred = 10 · 10 = 100.

Similarly, we may see that the notation for one thousand is 1000, and so on.
In general, $r_n r_{n-1} \cdots r_1 r_0$ represents

$$r_n(10)^n + r_{n-1}(10)^{n-1} + \cdots + r_1(10) + r_0$$

so it follows that $(10)^n$, whose expansion is surely

$$(10)^n = 1(10)^n + 0(10)^{n-1} + \cdots + 0(10) + 0,$$

is denoted by

$$(10)^n = 100 \cdots 0, \qquad n \ge 1,$$

where n zeros appear on the right.

We may note in passing that we are now “entitled” to read numbers (or
better, symbols representing numbers) in the usual way. For example, 4352
(which is identified as four, three, five, two, the way one lists a telephone
number) represents

$$4(10)^3 + 3(10)^2 + 5(10) + 2 = 4(1000) + 3(100) + 5(10) + 2$$

so it is four thousand three hundred fifty two.

Let us turn to the question of how to compute. First of all, we must know 
how to add and multiply numbers from 0 to 9. This is accomplished by 
primitive means—that is, by counting. For example, we obtain 6 + 8 = 14, 
9 + 7 = 16, 6 • 8 = 48, 9 • 7 = 63. All these facts may be listed in two tables, 
one for addition and the other for multiplication; these are the tables that 
elementary school children used to spend so much time memorizing in the not 
too distant past. 

In one leap forward we can now add or multiply any two integers; this
involves use of the tables, and keeping the meaning of our notation clearly in
mind. For example,

$$\begin{aligned}
35{,}874 + 6196 &= [3(10)^4 + 5(10)^3 + 8(10)^2 + 7(10) + 4] \\
&\quad + [6(10)^3 + 1(10)^2 + 9(10) + 6] \\
&= 3(10)^4 + (5 + 6)(10)^3 + (8 + 1)(10)^2 + (7 + 9)(10) + (4 + 6) \\
&= 3(10)^4 + (1(10) + 1)(10)^3 + 9(10)^2 + (1(10) + 6)(10) + 1(10) \\
&= 3(10)^4 + 1(10)^4 + 1(10)^3 + 9(10)^2 + 1(10)^2 + 6(10) + 1(10) \\
&= (3 + 1)(10)^4 + 1(10)^3 + (9 + 1)(10)^2 + (6 + 1)(10) \\
&= 4(10)^4 + 1(10)^3 + (10)(10)^2 + 7(10) \\
&= 4(10)^4 + 1(10)^3 + 1(10)^3 + 7(10) \\
&= 4(10)^4 + 2(10)^3 + 0(10)^2 + 7(10) + 0 \\
&= 42{,}070.
\end{aligned}$$

The standard way to write this addition is

      35874
    +  6196
    -------
      42070

and the reader should convince himself that the way he performs this addition
is just a compact version of the gory details given above. The same procedure
obviously works for any addition problem.

As for multiplication, when we write, for example,

         726
       × 354
       -----
        2904
       3630
      2178
      ------
      257004

this is just a telescoped version (as the reader may verify) of

$$\begin{aligned}
(726) \cdot (354) &= [7(10)^2 + 2(10) + 6] \cdot [3(10)^2 + 5(10) + 4] \\
&= (7 \cdot 3)(10)^4 + (2 \cdot 3)(10)^3 + (6 \cdot 3)(10)^2 \\
&\quad + (7 \cdot 5)(10)^3 + (2 \cdot 5)(10)^2 + (6 \cdot 5)(10) \\
&\quad + (7 \cdot 4)(10)^2 + (2 \cdot 4)(10) + (6 \cdot 4) \\
&= (2(10) + 1)(10)^4 + 6(10)^3 + (1(10) + 8)(10)^2 \\
&\quad + (3(10) + 5)(10)^3 + (1(10))(10)^2 + (3(10))(10) \\
&\quad + (2(10) + 8)(10)^2 + 8(10) + 2(10) + 4 \\
&= 2(10)^5 + 1(10)^4 + 7(10)^3 + 8(10)^2 \\
&\quad + 3(10)^4 + 6(10)^3 + 3(10)^2 \\
&\quad + 2(10)^3 + 9(10)^2 + 4 \\
&= 2(10)^5 + (1 + 3)(10)^4 + (7 + 6 + 2)(10)^3 \\
&\quad + (8 + 3 + 9)(10)^2 + 4 \\
&= 2(10)^5 + 4(10)^4 + (1(10) + 5)(10)^3 + (2(10))(10)^2 + 4 \\
&= 2(10)^5 + 4(10)^4 + 1(10)^4 + 5(10)^3 + 2(10)^3 + 4 \\
&= 2(10)^5 + 5(10)^4 + 7(10)^3 + 0(10)^2 + 0(10) + 4 \\
&= 257{,}004.
\end{aligned}$$

The same procedure obviously works for any multiplication problem. 
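The compact column procedures for addition and multiplication can be sketched as operations on lists of digits, with carries handled exactly as in the expanded computations. This is an illustrative Python sketch; the function names are hypothetical, digits are listed leading-first, and the `base` parameter is included only to make the dependence on the radix visible.

```python
def add_digits(x, y, base=10):
    """Column addition with carries; x and y are digit lists, leading first."""
    x, y = x[::-1], y[::-1]           # work from the units column leftward
    out, carry = [], 0
    for i in range(max(len(x), len(y))):
        column = carry + (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0)
        carry, digit = divmod(column, base)
        out.append(digit)
    if carry:
        out.append(carry)
    return out[::-1]


def mul_digits(x, y, base=10):
    """Schoolbook multiplication: accumulate shifted partial products."""
    x, y = x[::-1], y[::-1]
    out = [0] * (len(x) + len(y))
    for i, dx in enumerate(x):
        carry = 0
        for j, dy in enumerate(y):
            carry, out[i + j] = divmod(out[i + j] + dx * dy + carry, base)
        out[i + len(y)] += carry      # this slot is still 0 before the addition
    while len(out) > 1 and out[-1] == 0:
        out.pop()                     # drop leading zeros
    return out[::-1]


print(add_digits([3, 5, 8, 7, 4], [6, 1, 9, 6]))
print(mul_digits([7, 2, 6], [3, 5, 4]))
```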

Subtraction presents no serious difficulties; it works very much like
addition. The reader can easily choose two integers, use the expanded notation
(meaning that the powers of 10 are kept) to subtract the smaller from the
larger one, and observe how this leads to the standard notation and procedure
for performing the subtraction. Of course, the key thing is to understand
“borrowing.”

Division is more difficult; it is not quite straightforward, and involves some
trial and error. The reader may wish, as a challenge, to explain in detail how
the division

          124
         ----
    37 | 4589
         37
         --
          88
          74
          --
          149
          148
          ---
            1

comes from something like

$$\begin{aligned}
4(10)^3 + 5(10)^2 + 8(10) + 9 &= (3(10) + 7)(1(10)^2) + 8(10)^2 + 8(10) + 9, \\
8(10)^2 + 8(10) + 9 &= (3(10) + 7)(2(10)) + 1(10)^2 + 4(10) + 9, \\
1(10)^2 + 4(10) + 9 &= (3(10) + 7)(4) + 1,
\end{aligned}$$

so

$$4(10)^3 + 5(10)^2 + 8(10) + 9 = (3(10) + 7)(1(10)^2 + 2(10) + 4) + 1.$$

1-8-4. Remarks. The notation discussed above is known as positional
notation or as a place value system because each “digit” (depending on its
position or place) is associated with an appropriate power of ten. The final
step in the historical development of this notation was the use of a symbol
“0” for missing terms. Of course, the choice of a = ten as the base is fairly
natural; through the ages, man usually did his counting in terms of tens
because he had ten fingers. However, any other choice of $a > 1$ as a base is
valid and feasible; it leads to the same kind of notation and computational
procedures.

Let us illustrate by taking a = seven. There is no harm in writing this as
a = 7; in general, instead of naming numbers by words, we find it convenient to
write them in the common, base ten, notation. According to the theorem on
radix representation 1-8-2, every positive number b has a unique expression

$$b = r_n \cdot 7^n + r_{n-1} \cdot 7^{n-1} + \cdots + r_1 \cdot 7 + r_0$$

where $0 \le r_0, r_1, \ldots, r_n < 7$, $r_n \ne 0$. We abbreviate this to

$$b = (r_n r_{n-1} \cdots r_1 r_0)_{\text{seven}}$$

or

$$b = (r_n r_{n-1} \cdots r_1 r_0)_7$$

and call it the representation of (or expansion of, or expression for) b in base 7.
Note that in the base 7 expansion of a number only the symbols (or digits)
0, 1, 2, 3, 4, 5, 6 can appear; an expression like $(207{,}185)_7$ is meaningless.

The subscript seven, or 7, in the notation is essential. It signifies in which
base we are working. As a general rule, when an expression is given without
subscript, the base ten will be understood. Thus 2156 means $2(10)^3 + 1(10)^2
+ 5(10) + 6$, while $(2156)_7$ means $2(7)^3 + 1(7)^2 + 5(7) + 6$; clearly, these are
different numbers. As a matter of fact, $(2156)_7 = 2(343) + 1(49) + 5(7) + 6 =
776$ (by which is meant $7(10)^2 + 7(10) + 6$).

We know that every number has a unique representation in base 7; can
we actually find it? Fortunately, to settle this question it is only necessary to
recall some things that have already been done. The existence part of the
theorem on radix representation 1-8-2 was proved in a “constructive”
fashion: namely, we produced (that is, constructed) the r’s of the expansion
by listing the remainders that arise by repeated application of the division
algorithm. This gives a detailed recipe for finding the base 7 (or any other
base) representation of a number. Even more, this recipe has already been
illustrated in 1-8-1, where in one of the examples we arrived at

$$489 = 1 \cdot 7^3 + 2 \cdot 7^2 + 6 \cdot 7 + 6,$$

which says that

$$489 = (1266)_7.$$

In other words, the number four hundred eighty nine (which is written as 489
in the base ten) has the representation $(1266)_7$ in base 7.

As a trivial example of the recipe, we apply it to 6 and obtain $6 = 0 \cdot 7 + 6$;
hence,

$$6 = (6)_7.$$

Of course, this is also obvious directly; after all, the unique expression for 6
in terms of powers of 7 is clearly

$$6 = \cdots + 0 \cdot 7^2 + 0 \cdot 7 + 6.$$

In exactly the same way, we see that

$$1 = (1)_7, \quad 2 = (2)_7, \quad 3 = (3)_7, \quad 4 = (4)_7, \quad 5 = (5)_7.$$

Let us go one step further and find the expressions for 7 and 8 in base 7.
There is no need to use the recipe because the expressions are obviously

$$7 = 1 \cdot 7 + 0 \quad \text{and} \quad 8 = 1 \cdot 7 + 1$$

so

$$7 = (10)_7 \quad \text{and} \quad 8 = (11)_7.$$

In order to compute in base 7 we must first know how to operate with the
digits 0, 1, 2, 3, 4, 5, 6, which are the only ones used in base 7. But this is
easy; for example,

$$(5)_7 + (6)_7 = 5 + 6 = 11 = 1 \cdot 7 + 4 = (14)_7,$$
$$(5)_7 \cdot (6)_7 = 5 \cdot 6 = 30 = 4 \cdot 7 + 2 = (42)_7.$$

In fact, it is easy to make tables for addition and multiplication which contain
all the required data.

    addition in base 7

     + |  0  1  2  3  4  5  6
    ---+---------------------
     0 |  0  1  2  3  4  5  6
     1 |  1  2  3  4  5  6 10
     2 |  2  3  4  5  6 10 11
     3 |  3  4  5  6 10 11 12
     4 |  4  5  6 10 11 12 13
     5 |  5  6 10 11 12 13 14
     6 |  6 10 11 12 13 14 15

    multiplication in base 7

     · |  0  1  2  3  4  5  6
    ---+---------------------
     0 |  0  0  0  0  0  0  0
     1 |  0  1  2  3  4  5  6
     2 |  0  2  4  6 11 13 15
     3 |  0  3  6 12 15 21 24
     4 |  0  4 11 15 22 26 33
     5 |  0  5 13 21 26 34 42
     6 |  0  6 15 24 33 42 51


It is understood that all the entries in the tables should really include the
subscript 7. As for the digits 0, 1, 2, 3, 4, 5, 6, these can be viewed either in
base ten or base seven without affecting anything, because if a is one of these
digits, then, as seen above, $a = (a)_7$. In particular, both $5 \cdot 5 = (34)_7$ and
$(5)_7 \cdot (5)_7 = (34)_7$ may be read off from the table.
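The digit tables for base 7 can be generated rather than worked out by hand. The following Python sketch (helper names are hypothetical, and it assumes a base of at most ten so that ordinary digit characters suffice) computes each sum or product as an ordinary integer and then writes it in the chosen base.

```python
from operator import add, mul


def numeral(n, base):
    """Write the nonnegative integer n in the given base (base <= 10 assumed)."""
    if n == 0:
        return "0"
    text = ""
    while n > 0:
        n, r = divmod(n, base)
        text = str(r) + text   # remainders appear from right to left
    return text


def digit_table(base, op):
    """Rows a = 0..base-1, columns b = 0..base-1, entries op(a, b) in that base."""
    return [[numeral(op(a, b), base) for b in range(base)] for a in range(base)]


for row in digit_table(7, mul):
    print(" ".join(entry.rjust(2) for entry in row))
```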

1-8-5. Examples. We give several examples concerning numbers in various
bases.

(1) What numbers are represented by $(23014)_5$ and $(10110101)_2$?

Answer:

$$(23014)_5 = 4 + 1 \cdot 5 + 0 \cdot 5^2 + 3 \cdot 5^3 + 2 \cdot 5^4 = 1634,$$

$$(10110101)_2 = 1 + 0 \cdot 2 + 1 \cdot 2^2 + 0 \cdot 2^3 + 1 \cdot 2^4 + 1 \cdot 2^5 + 0 \cdot 2^6 + 1 \cdot 2^7 = 181.$$

(2) Find the expression for 40152 in base 7, and also in base 6.

Answer: The division algorithm recipe takes the form

    40152 = 5736 · 7 + 0,
     5736 =  819 · 7 + 3,
      819 =  117 · 7 + 0,
      117 =   16 · 7 + 5,
       16 =    2 · 7 + 2,
        2 =    0 · 7 + 2,

so that

$$40152 = (225030)_7.$$

It is convenient to arrange the repeated divisions as follows:

    7 | 40152 :
      |  5736 : 0
      |   819 : 3
      |   117 : 0
      |    16 : 5
      |     2 : 2
      |     0 : 2


where the remainders are placed to the right of the dotted line. Thus, to express
40152 in base 6, we compute

    6 | 40152 :
      |  6692 : 0
      |  1115 : 2
      |   185 : 5
      |    30 : 5
      |     5 : 0
      |     0 : 5

and conclude that

$$40152 = (505520)_6.$$

(3) Perform the following additions:

(i) $(10011101)_2 + (1110110)_2$, (ii) $(12212)_3 + (21021)_3$,

(iii) $(2163)_7 + (4565)_7$.

Solution: We have in the first instance

       (10011101)_2
     +  (1110110)_2
     --------------
      (100010011)_2

which is a reflection of what happens when everything is written out in terms
of powers of 2. One may then check this addition by transferring to the more
familiar base 10, and see that indeed 157 + 118 = 275.

In case (ii), we have

       (12212)_3
     + (21021)_3
     -----------
      (111010)_3

which, in base 10, says that 158 + 196 = 354.

Turning to case (iii),

       (2163)_7
     + (4565)_7
     ----------
      (10061)_7

which is another version of 780 + 1664 = 2444.

(4) Perform the following subtractions:

(i) $(23014)_5 - (4123)_5$, (ii) $(505520)_6 - (41432)_6$,

(iii) $(4562)_7 - (2165)_7$.

Solution: We have

      (23014)_5         (505520)_6         (4562)_7
    -  (4123)_5       -  (41432)_6       - (2165)_7
    -----------       ------------       ----------
      (13341)_5         (424044)_6         (2364)_7

and in base 10 these become 1634 − 538 = 1096, 40152 − 5564 = 34588,
1661 − 782 = 879, respectively.

(5) Perform the following multiplications:

(i) $(11001)_2 \cdot (1101)_2$, (ii) $(1342)_5 \cdot (314)_5$,

(iii) $(564)_7 \cdot (403)_7$.

Solution: In case (i), we write (in base 2)

            11001
          ×  1101
          -------
            11001
          110010
         11001
        ---------
        101000101

while the corresponding multiplication in base 10 is 25 · 13 = 325.

The work for (ii) looks as follows (with base 5 understood)

           1342
         ×  314
         ------
          12023
          1342
        10131
        -------
        1044043

and in base 10 this translates to 222 · 84 = 18648.

As for (iii), working in base 7, we have

           564
         × 403
         -----
          2355
        32520
        ------
        330555

and in base 10 this translates to (291)(199) = 57909.

(6) Carry out the division algorithm (that is, divide) for the pairs of
numbers

(i) $(11101)_2$ and $(111011000111)_2$,
(ii) $(23)_7$ and $(225030)_7$.

Solution: The division in base 2 looks like

                10000010
            ------------
    11101 | 111011000111
            11101
            -----
                 100011
                  11101
                 ------
                    1101

Thus we have

$$(111011000111)_2 = (10000010)_2 \cdot (11101)_2 + (1101)_2$$

which is another way of saying

    3783 = (130)(29) + 13.

As for part (ii), the division in base 7 takes the form

           6612
         ------
    23 | 225030
         204
         ---
          210
          204
          ---
            33
            23
            --
            100
             46
            ---
             21

We have, therefore,

$$(225030)_7 = (6612)_7 \cdot (23)_7 + (21)_7$$

which indeed corresponds to the base 10 equation

    40152 = (2361) · (17) + 15.
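The computations in examples (3) through (6) can be cross-checked by converting each numeral to an ordinary integer. The sketch below relies on Python's built-in `int(text, base)`, which accepts bases from 2 to 36; the dictionary labels and the helper name `val` are illustrative only.

```python
def val(numeral, base):
    """Integer represented by the digit string numeral in the given base."""
    return int(numeral, base)


checks = {
    "3(i)":   val("10011101", 2) + val("1110110", 2) == val("100010011", 2),
    "3(ii)":  val("12212", 3) + val("21021", 3) == val("111010", 3),
    "3(iii)": val("2163", 7) + val("4565", 7) == val("10061", 7),
    "4(i)":   val("23014", 5) - val("4123", 5) == val("13341", 5),
    "4(ii)":  val("505520", 6) - val("41432", 6) == val("424044", 6),
    "4(iii)": val("4562", 7) - val("2165", 7) == val("2364", 7),
    "5(i)":   val("11001", 2) * val("1101", 2) == val("101000101", 2),
    "5(ii)":  val("1342", 5) * val("314", 5) == val("1044043", 5),
    "5(iii)": val("564", 7) * val("403", 7) == val("330555", 7),
    # divmod returns (quotient, remainder) of the division algorithm.
    "6(i)":   divmod(val("111011000111", 2), val("11101", 2))
              == (val("10000010", 2), val("1101", 2)),
    "6(ii)":  divmod(val("225030", 7), val("23", 7))
              == (val("6612", 7), val("21", 7)),
}

for label, ok in checks.items():
    print(label, ok)
```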

1-8-6. Remark. Instead of looking only at bases smaller than 10, as has
been done so far, let us consider a base greater than 10, namely the base 12.
In order to represent numbers in base 12 we need symbols for the integers
which are greater than or equal to 0 and less than twelve. Let us use the
standard digits 0, 1, ..., 9 for the integers from zero through nine,
respectively, and then introduce the symbols

    t for ten, e for eleven.

Maintaining our convention that numbers without subscripts are in base 10,
we may note that

$$10 = (t)_{12}, \quad 11 = (e)_{12}, \quad 12 = (10)_{12}, \quad 13 = (11)_{12},$$

$$21 = (19)_{12}, \quad 22 = (1t)_{12}, \quad 23 = (1e)_{12}, \quad 24 = (20)_{12},$$

and so on. Of course, the unadorned symbols t and e have no meaning in
base ten; they exist only in the context of base twelve.

To compute in base 12, one must know how to operate with the digits
0, 1, 2, ..., 9, t, e. This information may be gathered into tables for addition
and multiplication, among whose entries the following facts will be found.

$$(t)_{12} + (t)_{12} = (18)_{12}, \quad (t)_{12} + (e)_{12} = (19)_{12}, \quad (e)_{12} + (e)_{12} = (1t)_{12},$$

$$(t)_{12} \cdot (t)_{12} = (84)_{12}, \quad (t)_{12} \cdot (e)_{12} = (92)_{12}, \quad (e)_{12} \cdot (e)_{12} = (t1)_{12}.$$

These are easy to derive; for example,

$$(t)_{12} + (e)_{12} = 10 + 11 = 21 = 1 \cdot 12 + 9 = (19)_{12},$$

$$(t)_{12} \cdot (t)_{12} = 10 \cdot 10 = 100 = 8 \cdot 12 + 4 = (84)_{12},$$

$$(e)_{12} \cdot (e)_{12} = 11 \cdot 11 = 121 = 10 \cdot 12 + 1 = (t1)_{12}.$$

The connections between base 10 and base 12 work just like they did earlier
when bases other than 12 were used. For example, to find the number $(t5e7)_{12}$
we write

$$\begin{aligned}
(t5e7)_{12} &= t(12)^3 + 5(12)^2 + e(12) + 7 \\
&= 10(1728) + 5(144) + 11(12) + 7 \\
&= 18139.
\end{aligned}$$

Thus, $(t5e7)_{12} = 18139$, and we may check this by finding the expression for
18139 in base 12, namely,

    12 | 18139 :
       |  1511 : 7
       |   125 : 11
       |    10 : 5
       |     0 : 10

so because the remainders are 7, 11, 5, 10 the base 12 notation for 18139 is
indeed $(t5e7)_{12}$.

As illustrations of how one adds and subtracts in base 12, we have

       (t5e7)_12           (t5e7)_12
     + (469t)_12         - (469t)_12
     -----------         -----------
      (13095)_12           (5e19)_12

Incidentally, the reader may check that in base 10 these become 18139
+ 7894 = 26033 and 18139 − 7894 = 10245.

Somewhat more effort is required to do a multiplication in base 12; for 
example (with subscripts dropped momentarily for convenience) 

27st 
x t3s 
253t2 
7ss6 
227t4 


2363742 
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so that (27st) 12 * (t3c) 12 = (2363742) 12 . In base 10 this says: 4606 • 1487 = 
6849122. 

Finally, let us divide $(t5e7)_{12}$ by $(2e)_{12}$, which corresponds to dividing
18139 by 35. The work looks like

          372
    2e ) t5e7
         89
         --
         18e
         185
         ---
           67
           5t
           --
            9

so that

$$(t5e7)_{12} = (372)_{12} \cdot (2e)_{12} + (9)_{12}.$$

In base 10 this becomes

$$18139 = 518 \cdot 35 + 9.$$
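
The long division, and the multiplication before it, can be cross-checked with a short sketch. The helper names `value`, `numeral`, and `long_divide` are hypothetical, and `t`, `e` stand for ten and eleven. The division routine follows the schoolbook layout: bring down one digit at a time, find how many times the divisor fits, and carry the remainder forward. For brevity it compares against the divisor's ordinary integer value rather than working entirely with base 12 tables.

```python
DIGITS = "0123456789te"   # 't' is ten, 'e' is eleven (illustrative choice)

def value(numeral, base=12):
    """Integer denoted by a numeral string in the given base."""
    n = 0
    for symbol in numeral:
        n = n * base + DIGITS.index(symbol)
    return n

def numeral(n, base=12):
    """Numeral string for a nonnegative integer in the given base."""
    out = ""
    while True:
        n, r = divmod(n, base)
        out = DIGITS[r] + out
        if n == 0:
            return out

def long_divide(dividend, divisor, base=12):
    """Schoolbook long division; returns (quotient, remainder) as numerals."""
    d = value(divisor, base)
    quotient, rem = "", 0
    for symbol in dividend:
        rem = rem * base + DIGITS.index(symbol)   # bring down the next digit
        q, rem = divmod(rem, d)                   # q is always a single digit
        quotient += DIGITS[q]
    return quotient.lstrip("0") or "0", numeral(rem, base)

print(long_divide("t5e7", "2e"))                 # ('372', '9')
print(numeral(value("27et") * value("t3e")))     # 2363742
```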


1-8-7. Nim. Consider the following game (known as Nim) for two players. 
At the start, the players are presented with three separate piles of chips, stones, 
or what have you. The number of objects in each pile is arbitrary, but greater 
than 0. Each player, in turn, then removes as many objects as he wishes 
(possibly even all of them) from exactly one of the piles, and throws them away. 
Of course, at his turn, a player cannot stand pat—he must remove one or 
more objects. The player who removes the last object wins. 

This game is far from trivial, as the reader will surely discover by playing 
it. However, it is, surprisingly perhaps, subject to precise mathematical 
analysis. 

At his turn, a player is confronted with a triplet of nonnegative integers
$\{n_1, n_2, n_3\}$, where $n_i$ represents the number of objects in the $i$th pile for
$i = 1, 2, 3$. His move amounts to changing one of the $n_i$ to a strictly smaller
nonnegative integer and keeping the other two fixed. Now, let us write out
the base 2 expansions for $n_1, n_2, n_3$, one under the other, with the columns
lined up, and then count the number of 1's in each column. For example, if
$\{n_1, n_2, n_3\} = \{46, 27, 35\}$ we write

              46 = 101110
              27 =  11011
              35 = 100011
    count of 1's:  212132
(the count of 1's is simply a list which, for every column, tells how many 1's
it contains) and if $\{n_1, n_2, n_3\} = \{50, 20, 38\}$, we write

              50 = 110010
              20 =  10100
              38 = 100110
    count of 1's:  220220

We say that $\{n_1, n_2, n_3\}$ is an even position when the count of 1's contains
only even integers (0 is considered to be even). By an odd position we mean one
that is not even; thus, an odd position is one for which the count of 1's
contains at least one odd number. For example, {46, 27, 35} is an odd position,
{50, 20, 38} is an even position, and {0, 0, 0} is an even position. Clearly,
a position is either odd or even, but not both.
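
The classification of positions is easy to mechanize. The sketch below is illustrative only (the names `column_counts` and `is_even_position` are not from the text); it lines up the binary expansions by bit position and counts the 1's in each column, listing the counts from the leftmost column to the rightmost.

```python
def column_counts(piles):
    """Count of 1's in each binary column, leftmost column first."""
    width = max(max(piles).bit_length(), 1)
    return [sum((n >> k) & 1 for n in piles) for k in reversed(range(width))]

def is_even_position(piles):
    """A position is even when every column count is an even integer."""
    return all(count % 2 == 0 for count in column_counts(piles))

print(column_counts([46, 27, 35]), is_even_position([46, 27, 35]))
# [2, 1, 2, 1, 3, 2] False
print(column_counts([50, 20, 38]), is_even_position([50, 20, 38]))
# [2, 2, 0, 2, 2, 0] True
```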

Once we have a classification for positions, it is useful to classify moves. We 
say that a player has made a good move if he leaves his opponent with an 
even position, and a bad move if he leaves his opponent with an odd position. 
For example, a move that leaves the opponent in the position {18, 3, 17} is a 
good move, while a move that leaves him with the position {18, 2, 17} is a 
bad move. Clearly, a move is either good or bad, but not both. 

Now that the terminology is in place, we turn to two crucial facts, whose 
proofs are left to the reader. 

(1) If a player is confronted with an even position, then any move he 
makes is a bad move. (Before undertaking the proof, the reader should find 
it helpful to try this out on the even position {50, 20, 38}.) 

(2) If a player is confronted with an odd position, then he can make a 
good move—in other words, by selecting his move judiciously he can leave 
his opponent with an even position. (Before undertaking the proof, the reader 
should find it helpful to try this out on the odd position {46, 27, 35}. A move 
to the position {46, 13, 35} is a good move; in fact, it is the only good move a 
player can make when confronted with {46, 27, 35}.) 

Suppose a game is in progress, and at some point one of the players (call 
him A) is confronted with an odd position. According to (2), he can make a 
good move and leave B with an even position. Then, according to (1), B is 
forced to make a bad move; in other words, he cannot avoid leaving A in an 
odd position. Therefore, if A knows what he is doing, he can “guarantee” 
that every move he makes is a good move and that every move B makes is a 
bad move. The last move in the game, namely the winning move, is one that 
leaves the opponent with the even position {0, 0, 0}—so it is a good move. 
Since B is forced to make only bad moves, he never gets the opportunity to 
remove the last object (because this is a good move). This means that
eventually A wins.
The conclusion is that a player who understands the game is almost certain
to win against a player who does not. We have “ solved ” the game of Nim, an 
old Chinese game. 
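
A search for good moves can be sketched in code. The function name `good_moves` is hypothetical, and the sketch uses a reformulation that is not stated in the text: the parity of the count of 1's in each column is exactly the bitwise "exclusive or" of the pile sizes. Replacing a pile n by n XOR s, where s is the exclusive or of all three piles, makes every column count even; this is a legal move only when it makes the pile strictly smaller.

```python
from functools import reduce
from operator import xor

def good_moves(piles):
    """All moves (pile index, new size) that leave the opponent an even position."""
    s = reduce(xor, piles, 0)   # bit k of s is 1 exactly when column k has an odd count
    moves = []
    for i, n in enumerate(piles):
        target = n ^ s          # the only new size for pile i that evens out every column
        if target < n:          # a move must remove at least one object
            moves.append((i, target))
    return moves

print(good_moves([46, 27, 35]))   # [(1, 13)]  i.e. move to {46, 13, 35}
print(good_moves([50, 20, 38]))   # []  an even position admits no good move
```

The second output agrees with fact (1): from an even position s is 0, so no pile can be reduced to reach another even position.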

1-8-8 / PROBLEMS

1. Make addition and multiplication tables for each of the following bases:

(i) 2, (ii) 3, (iii) 5, (iv) 6, (v) 12.

2. Find the expression for 234,567 in each of the following bases:

(i) 2, (ii) 3, (iii) 5, (iv) 6, (v) 7, (vi) 12.

3. Find the following numbers:

(i) $(1101001101)_2$, (ii) $(211020121)_3$,

(iii) $(32414)_5$, (iv) $(150423)_6$,

(v) $(32414)_7$, (vi) $(1te93)_{12}$.

4. What are the advantages and disadvantages of the binary system (meaning: 
base 2) as compared with the duodecimal system (meaning: base 12)? 

5. Find the expression for 

(i) $(1101001101)_2$ in base 3, (ii) $(32414)_5$ in base 6,

(iii) $(32414)_7$ in base 12, (iv) $(t3e)_{12}$ in base 5,

(v) $(41503)_6$ in base 3, (vi) $(202021)_3$ in base 7.

6. Perform the following additions, and check the results by transferring to 
base 10: 

(i) $(11010101)_2 + (101011111)_2$, (ii) $(41343)_5 + (24312)_5$,

(iii) $(301246)_7 + (123456)_7$, (iv) $(2t7e)_{12} + (805t)_{12}$.

7. Perform the following subtractions, and check the results by transferring 
to base 10: 

(i) $(202021)_3 - (12202)_3$, (ii) $(150423)_6 - (41503)_6$,

(iii) $(301246)_7 - (123456)_7$, (iv) $(805t)_{12} - (2t7e)_{12}$.

8. Perform the following multiplications, and check the results by
transferring to base 10:

(i) $(5423)_6 \cdot (453)_6$, (ii) $(3246)_7 \cdot (564)_7$,

(iii) $(85t)_{12} \cdot (t7e)_{12}$.

9. Divide (and check your results)

(i) $(111)_2$ into $(101101101)_2$, (ii) $(43)_5$ into $(41343)_5$,

(iii) $(38)_{12}$ into $(2t7e)_{12}$.

10. Find the greatest common divisor of

(i) $(212)_3$ and $(20102)_3$, (ii) $(416)_7$ and $(3025)_7$.

11. How is the game of Nim affected if there are more than three piles at the start?

12. Can you find a criterion which tells if $(r_n r_{n-1} \cdots r_0)_7$ is divisible by
$(11)_7$? What about divisibility by $(6)_7$?

13. I once knew a farmer who knew how to add and subtract, but had never 
been taught the multiplication table. However, he did know how to multiply 
by 2 and divide by 2. To multiply two integers he followed a procedure 
which we illustrate in the case 97 • 113. 

Divide 97 by 2 to get a quotient of 48, and throw the remainder away. 
Repeat this for 48, and keep going until a quotient 1 is reached. Starting 
from 97 write all the quotients, in order, in a column. To the right of this 
column, write another column with exactly the same number of entries 
as the first one. This column starts with 113, and each of the subsequent 
entries is obtained from the one preceding it by multiplying by 2. For each 
of the even entries in the left-hand column, cross out the corresponding 
entry of the right-hand column. Adding the remaining entries of the
right-hand column gives the product 97 • 113. In full detail, we have:

     97      113
     48      226   (crossed out)
     24      452   (crossed out)
     12      904   (crossed out)
      6     1808   (crossed out)
      3     3616
      1     7232
           -----
           10961

so 97 • 113 = 10961. Explain why this method (which used to be standard
operating procedure for Russian peasants) works.

14. Given an integer in base 10, we know how to apply the division algorithm 
repeatedly in order to find its expression in any other base. Explain how 
the same idea may be used to translate a number given in an arbitrary 
base ($\ne 10$) into base 10. For example, given a number in base 7, one
divides it, in base 7, by $(13)_7$ (which is the base 7 expression for 10) and
keeps using this divisor $(13)_7$ in all the “divisions.”

Do Problem 3 by this method. 
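
For readers who wish to experiment with the farmer's procedure of Problem 13 before explaining it, here is a minimal sketch. The function name `peasant_multiply` is hypothetical. It records each row of the two columns together with a flag saying whether the right-hand entry is kept (left-hand entry odd) or crossed out (left-hand entry even); it only carries out the steps and does not explain why they work.

```python
def peasant_multiply(a, b):
    """Multiply a >= 1 by b using only halving, doubling, and addition.

    Returns (product, rows), where each row is (left entry, right entry, kept).
    """
    total, rows = 0, []
    while a >= 1:
        kept = a % 2 == 1          # even left entries have their partner crossed out
        rows.append((a, b, kept))
        if kept:
            total += b
        a //= 2                    # halve, throwing the remainder away
        b *= 2                     # double
    return total, rows

product, rows = peasant_multiply(97, 113)
for left, right, kept in rows:
    print(f"{left:>3} {right:>6}" + ("" if kept else "   (crossed out)"))
print(product)   # 10961
```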

Miscellaneous Problems 

1. Show that for any n > 1,

(i) $3 \mid (2^{2n} - 1)$, (ii) $3 \mid (2^{2n-1} + 1)$,

(iii) $(a - b) \mid (a^n - b^n)$, (iv) $(a + b) \mid (a^{2n} - b^{2n})$,

(v) $(a + b) \mid (a^{2n-1} + b^{2n-1})$, (vi) $(a^2 - b^2) \mid (a^{2n} - b^{2n})$.

2. If n > 1 and (a, b) = 1 show that $(a^{n+1} + b^{n+1}, a^n + b^n)$ divides $a - b$.

3. Let m and n be positive integers; show that

(i) $2^n + 1$ is prime $\Rightarrow$ n is a power of 2;

(ii) $2^n - 1$ is prime $\Rightarrow$ n is prime;

(iii) $(2^m - 1) \mid (2^n - 1) \Leftrightarrow m \mid n$.

4. Prove that the square of an integer not divisible by 2 or 3 is of the form 
12 n + 1. 

5. If both a and b are prime to 3, show that $a^2 + b^2$ cannot be a perfect
square.

6. If n is odd, prove that there are no prime triplets n, n + 2, n + 4 (meaning
that all three are prime) except 3, 5, 7.

7. Express the cube of any positive integer as the difference of the squares 
of two integers. 

8. Do there exist distinct positive integers x, y, z for which x + y + z = xyz?

9. Prove that $n^2 - 6n + 10$ is positive for every integer n.

10. Find the square root of each of the following, in the appropriate base: 

(i) $(11010010001)_2$, (ii) $(2022021)_3$,

(iii) $(11441)_6$, (iv) $(4621)_7$.

11. Consider the set $T = \{1, 3, 5, \ldots, 2n - 1\}$ and suppose $3^r$ is the highest
power of 3 that belongs to T. Show that $3^r$ does not divide any other
element of T.

12. If $a_1, a_2, b_1, b_2$ are integers with $a_1 b_2 - a_2 b_1 = \pm 1$ show that $a_1 + a_2$
and $b_1 + b_2$ are relatively prime [in other words, the fraction
$(a_1 + a_2)/(b_1 + b_2)$ is in lowest terms].

13. Suppose a and b are positive integers with $a \mid b^2$, $b^2 \mid a^3$, $a^3 \mid b^4$, ...,
$b^{2n} \mid a^{2n+1}$, $a^{2n+1} \mid b^{2n+2}$, ... (ad infinitum). Show that a = b.

14. Show that $3n^5 + 5n^3 + 7n$ is divisible by 15 for every $n \in \mathbf{Z}$.

15. State and prove a criterion that tells when an integer is divisible by 7. 

16. Can you find an integer n such that 7n has

(i) all its digits equal to 1? (ii) all its digits equal to 3?

17. Consider the number (800)! = 800 • 799 • 798 • • • 2 • 1. When this number 
is multiplied out, how many zeros are there at the end ? 

18. Find 25 consecutive integers all of which are composite. More generally, 
show that there exist consecutive primes (meaning that all integers between 
them are composite) whose difference is arbitrarily large. 
19. Use the $\nu_p$'s to prove:

(i) If a | c, b | c, and (a, b) = 1 then ab | c.

(ii) If (a, b) = 1 and (a, c) = 1 then (a, bc) = 1.

(iii) If (a, b) = 1 and c | a then (c, b) = 1.

(iv) If (a, bc) = 1 then (a, b) = (a, c) = 1.

(v) If (a, b) = 1 then (ca, b) = (c, b).

(vi) If (a, b) = 1 then (ab, c) = (a, c)(b, c).

20. (i) Show that there are no nonzero integers m and n for which $m^2 = 2n^2$.
Why does this imply that $\sqrt{2}$ is “irrational”?

(ii) More generally, if a > 1 is not the square of an integer, then $m^2 = an^2$
cannot be solved in nonzero integers m and n; hence, $\sqrt{a}$ is
irrational.


21. (i) Find all pairs of positive integers {a, b} such that (a, b) = 10 and
[a, b] = 100.

(ii) Find all triples of positive integers {a, b, c} such that (a, b, c) = 10 and
[a, b, c] = 100.

22. Let two positive integers d and m be given. Show that there exists a pair
of positive integers {a, b} for which (a, b) = d and [a, b] = m if and only
if d | m. Moreover, in this situation, the number of such pairs is $2^r$, where
r is the number of distinct prime factors of m/d.

23. Suppose two positive integers c and d are given. Show that there exists a
pair of positive integers {a, b} for which:

(i) a + b = c and (a, b) = d $\Leftrightarrow$ d | c, (ii) ab = c and (a, b) = d $\Leftrightarrow$ $d^2 \mid c$.
In each case, can you count the number of such pairs?

24. For any integers a, b, c show that

(i) ([a, b], [a, c], [b, c]) = [(a, b), (a, c), (b, c)],

(ii) [a, b, c](ab, bc, ac) = |abc|,

(iii) [a, b, c](a, b, c) $\le$ |abc|, and equality holds if and only if a, b, c are
relatively prime in pairs.

25. Exhibit three integers a, b, c such that (a, b, c) = 1 and
(i) exactly one (ii) exactly two (iii) all three

of the pairs of integers {a, b}, {a, c}, {b, c} are relatively prime.

26. (i) For $s = |a_1 a_2 \cdots a_n| \ne 0$ show that

$$[a_1, a_2, \ldots, a_n] = \frac{s}{(s/a_1, s/a_2, \ldots, s/a_n)}.$$
(ii) If m > 0 is a common multiple of $a_1, a_2, \ldots, a_n$ (all nonzero) show
that:

$$m = [a_1, a_2, \ldots, a_n] \iff \left(\frac{m}{a_1}, \frac{m}{a_2}, \ldots, \frac{m}{a_n}\right) = 1.$$


27. (i) Consider the polynomial $f(x) = x^2 - x + 17$. Verify that f(n) is
prime for n = 1, 2, ..., 16 but f(17) is not prime.

(ii) Consider the polynomial $f(x) = x^2 - x + 41$. Verify that f(n) is
prime for n = 1, 2, ..., 40 but f(41) is not prime.

(iii) Find the smallest positive integer n for which $f(n) = n^2 - 79n + 1601$
is composite.


28. Prove that there exists no polynomial f(x) with integer coefficients such
that f(n) is prime for every positive integer n.


29. Discuss the problem of solving the two simultaneous linear diophantine
equations

$$a_1 x + b_1 y = c_1,$$
$$a_2 x + b_2 y = c_2.$$


30. Prove that the linear diophantine equation $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = c$
has a solution $\Leftrightarrow$ $(a_1, \ldots, a_n)$ divides c. Sketch a method for finding all
solutions.


31. State and prove a necessary and sufficient condition that the linear
diophantine equation ax + by = c have an infinite number of positive
solutions.


32. Suppose a and b are relatively prime positive integers, and consider the 
linear diophantine equation ax + by = c. 

(i) Give two proofs, one geometric, one algebraic, that there cannot be
an infinite number of positive solutions. 

(ii) If c > ab show that a positive solution exists. 

(iii) For each n > 0 show how to construct a nontrivial example of this 
type that has exactly n positive solutions. 

33. Find the smallest and the biggest integer c for which the linear diophantine
equation 5x + 7y = c has exactly nine positive solutions.

34. For n > 1, let $\tau(n)$ denote the number of positive divisors of n and let $\sigma(n)$
denote the sum of all the positive divisors of n.

(i) Find $\tau(n)$ and $\sigma(n)$ for $n = 11$, $11^5$, $(11)(13)$, $(11)^5(13)^7$, 6840.

(ii) For m = 220, n = 284 show that $\sigma(m) = \sigma(n) = m + n$. Such integers
m and n are said to be amicable numbers.

(iii) If m and n are any relatively prime positive integers then

$$\tau(mn) = \tau(m) \cdot \tau(n), \qquad \sigma(mn) = \sigma(m) \cdot \sigma(n).$$

(iv) Prove the following formulas for any n > 1:

$$\tau(n) = \prod_p \bigl(1 + \nu_p(n)\bigr), \qquad \sigma(n) = \prod_p \left(\frac{p^{1+\nu_p(n)} - 1}{p - 1}\right).$$

35. For n > 1 show that: $\tau(n)$ is even $\Leftrightarrow$ n is not a perfect square. Moreover,
in this situation

$$\prod_{d \mid n} d = n^{\tau(n)/2}.$$

What happens in the case where $\tau(n)$ is odd?

36. (i) For every integer r > 1 there are an infinite number of positive
integers n with $\tau(n) = r$.

(ii) What is the smallest positive n satisfying $\tau(n) = 15$?

37. (i) Find all n > 1 which satisfy $\sigma(n) = s$ when s = 10.

(ii) Do the same thing for s = 24 and s = 72.

(iii) Does there exist an s > 2 for which $\sigma(n) = s$ has no solution?

38. (i) Can you express 1547 as a difference of two squares? In how many
ways can you do so?

(ii) Do the same for 1768.

(iii) Describe a general procedure for settling this question for an
arbitrary integer n.

(iv) Show that n > 0 can be expressed as a difference of two squares if and 
only if it is of form 2k + 1 or 4k. Moreover, if n is an odd prime its 
expression as a difference of squares is unique. 

39. Find n > 1 for which n(n + 180) is a perfect square. How many such 
values of n can you find ? 

40. Show that the positive integer n can be expressed as a sum of consecutive
integers [that is, $n = m + (m + 1) + \cdots + (m + k)$] if and only if n is not a
power of 2.

41. If the positive integer n is odd, show that the sum of the first n integers
divides their product. What happens to this result when n is even?

42. Do there exist integers a > b > 0, n > 1 for which $a^n - b^n$ divides $a^n + b^n$?

43. Prove that $1 + \frac{1}{2} + \cdots + \frac{1}{n}$ is not an integer for any n > 1.

44. Suppose a > 1 and n > 1. If $a^n - 1$ is prime, show that a = 2 and n is
prime. Such primes $a^n - 1$ are known as Mersenne primes. Find a few of
them. [Note: A number of form $2^p - 1$ with p prime need not be prime;
for example, $2^{11} - 1$ is composite.]

45. (i) Suppose a > 1 and n > 1. If $a^n + 1$ is prime, show that a is even and
n is a power of 2.

(ii) A prime of form $2^{2^m} + 1$ is said to be a Fermat prime. Find a few of
them. [Note: a number of form $2^{2^m} + 1$ need not be prime; for
example, $2^{2^5} + 1$ is composite; show, in fact, that it is divisible by
641.]

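46. (i) Suppose a > 1 and m > n > 1. Show that
(*) $a^{2^n} + 1$ divides $a^{2^m} - 1$,

$$(a^{2^m} + 1,\ a^{2^n} + 1) = \begin{cases} 1 & \text{if } a \text{ is even,} \\ 2 & \text{if } a \text{ is odd.} \end{cases}$$

(ii) Use the infinite sequence

$$u_1 = 2^{2^1} + 1, \quad u_2 = 2^{2^2} + 1, \quad u_3 = 2^{2^3} + 1, \quad \ldots, \quad u_n = 2^{2^n} + 1, \quad \ldots$$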

to prove that the number of primes is infinite. 

47. Given two integers a and b , it is customary to compute their greatest 
common divisor via the Euclidean algorithm. This involves repeated use of 
the division algorithm. Show that the number of steps (meaning: uses of 
the division algorithm) needed to compute (a, b) is at most five times the
number of digits in (the base 10 expression for) the smaller number. 

48. (i) Suppose we are given a balance scale along with the five known
weights 1, 2, 4, 8, 16. Show that we can determine the weight of any
object whose weight is an integer less than or equal to 31 by placing 
an appropriate choice of the given weights in one pan. 

(ii) If we are given a sixth known weight, 32, show that any integral 
weight less than or equal to 63 can be determined. 

(iii) Generalize these facts to n weights. How is this related to 1-8-2? 

49. (i) Suppose we are given a balance scale and the four known weights
1, 3, 9, 27. Show that by placing some of these judiciously in one pan, 
or in both, we can determine the weight of any object whose weight 
is an integer less than or equal to 40. 

(ii) If we are given a fifth known weight, 81, show that, by using both 
pans, any integral weight less than or equal to 121 can be determined. 
(iii) Generalize these facts to n weights. 


50. If a > 1 is fixed, show that any $b \in \mathbf{Z}$ can be expressed uniquely in the
form b = qa + r where

$$-\frac{a-1}{2} \le r \le \frac{a-1}{2} \quad \text{when } a \text{ is odd},$$

$$-\frac{a}{2} < r \le \frac{a}{2} \quad \text{when } a \text{ is even}.$$

This may be referred to as the “least absolute value remainder” version
of the division algorithm.

51. (i) By repeated use of the least absolute value version of the division
algorithm, starting with integers a and b, one is led to a new version of
the Euclidean algorithm. [Note: If a remainder $r_i$ is negative, one
uses $|r_i|$ as the divisor in the next step.] Prove that this process also
determines (a, b).

(ii) Use this “new” Euclidean algorithm to compute (a, b) when a =
63020, b = 76084.

52. If a > 1 show that any b > 0 can be expressed uniquely in the form

$$b = r_n a^n + r_{n-1} a^{n-1} + \cdots + r_1 a + r_0,$$

with $r_n > 0$ and

$$-\frac{a-1}{2} \le r_0, r_1, \ldots, r_n \le \frac{a-1}{2} \quad \text{when } a \text{ is odd},$$

$$-\frac{a}{2} < r_0, r_1, \ldots, r_n \le \frac{a}{2} \quad \text{when } a \text{ is even}.$$

In particular, if a = 3 then any positive integer b can be expressed
uniquely as a sum of distinct powers of 3 with coefficients 1, 0, or -1,
and with leading coefficient 1; more precisely,

$$b = 3^n + r_{n-1} \cdot 3^{n-1} + r_{n-2} \cdot 3^{n-2} + \cdots + r_1 \cdot 3 + r_0,$$

where $-1 \le r_0, r_1, \ldots, r_{n-1} \le 1$. How is this related to Problem 49
concerning the use of integral weights on both sides of a balance scale?
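
The base 3 expansion with coefficients 1, 0, -1 described in Problem 52 can be computed by a small variation of the usual repeated-division scheme. The sketch below is illustrative only; the function name `balanced_ternary` is an assumption. Whenever the ordinary remainder is 2, it is replaced by -1, which is the least absolute value remainder for the divisor 3.

```python
def balanced_ternary(b):
    """Coefficients [r_n, ..., r_1, r_0], each in {-1, 0, 1}, with
    b = r_n * 3**n + ... + r_1 * 3 + r_0, for a positive integer b."""
    if b <= 0:
        raise ValueError("b must be positive")
    coeffs = []
    while b != 0:
        r = b % 3              # ordinary remainder 0, 1, or 2
        if r == 2:
            r = -1             # least absolute value remainder: 2 = 3 - 1
        coeffs.append(r)
        b = (b - r) // 3       # exact division, since b - r is a multiple of 3
    return coeffs[::-1]

print(balanced_ternary(5))    # [1, -1, -1]    5 = 9 - 3 - 1
print(balanced_ternary(40))   # [1, 1, 1, 1]   40 = 27 + 9 + 3 + 1
```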



RINGS AND DOMAINS 


In the preceding chapter, we assumed that the integers were known and
proceeded to investigate additional properties of Z such as: division
algorithm, Euclidean algorithm, unique factorization, greatest common divisor,
least common multiple, congruence, and radix representation. At appropriate
places in a number of proofs, we made use of certain commonly accepted
properties of Z. Thus, our approach was somewhat informal in the sense that the
axioms or assumptions about the integers were never formalized or made explicit.

In this chapter, we adopt another point of view and examine the facts
about Z that, heretofore, were assumed to be known. As a first step we
introduce several algebraic axioms about addition and multiplication in an
arbitrary set; these axioms are satisfied in Z, in $Z_m$, and also in many other
familiar systems. Step by step, we introduce additional axioms and discuss
some of their consequences. In the end, we arrive at a set of axioms that
describes Z completely!
2-1. Rings: Elementary Properties

In this section, we begin the formal axiomatic approach to algebraic 
systems; this is the point of view of modern algebra. As a first step, we 
introduce a very general algebraic object whose definition is motivated, in 
part, by the properties of Z and Z m . 

2-1-1. Definition. A ring is a nonempty set R , whose elements we denote 
by a,b,c,..., x,y, z, ..., together with two operations, addition (usually 
denoted by +) and multiplication (usually denoted by •), and such that the 
following properties hold: 

A1: closure under addition. To any pair of elements a and b of R (in the
given order) there is associated an element a + b in R. This element a + b is
called the sum of a and b; in words, a + b is read as a plus b.

A2: associative law for addition. For any elements a, b, c of R we have
(a + b) + c = a + (b + c).

A3: identity for addition. There exists an element 0 in R such that
a + 0 = a for all $a \in R$.

A4: inverse for addition. Given any $a \in R$ there exists an element $\bar{a} \in R$
such that

$$a + \bar{a} = 0.$$

A5: commutative law for addition. For any elements a and b of R we have

$$a + b = b + a.$$

M1: closure under multiplication. To any pair of elements a and b of R
(in the given order) there is associated an element a • b in R. This element
a • b (which we shall usually write simply as ab) is called the product of a and b;
in words, a • b is read as a times b.

M2: associative law for multiplication. For any elements a, b, c of R we
have

$$(ab)c = a(bc).$$

D1: distributive laws. For any a, b, c in R we have

$$a(b + c) = ab + ac \quad \text{and} \quad (b + c)a = ba + ca.$$

2-1-2. Discussion. The axioms for a ring may seem numerous and even 
mysterious, but with constant use they will eventually seem fairly natural. At 
this stage, it is necessary to make a number of remarks designed to elaborate
upon and clarify the definition.

(1) Operations, as we have called them in the very first sentence of the
definition, are often referred to as “binary operations.” To explain the
meaning of this term, let R × R denote the set of all ordered pairs (a, b) of
elements of R; in symbols,

$$R \times R = \{(a, b) \mid a \in R,\ b \in R\}.$$

Of course, (a, b) has nothing to do with greatest common divisor. Rather, the
notation is entirely analogous with the way the familiar x-y plane is formed
from two copies of the number line, and each point in the plane is given by the
ordered pair of its coordinates (x, y). By a binary operation on R we mean any
mapping

$$R \times R \to R$$

(the arrow is read as “into”), that is, a rule which assigns to each ordered
pair $(a, b) \in R \times R$ an element of R. We have chosen to denote the two binary
operations on a ring R by “+” and “•” respectively, in order to maintain the
analogy with Z, and also because this is the standard notation. On rare 
occasions other notation will be used. 

(2) Strictly speaking, it is not correct to refer to a ring R. After all, to talk 
about a ring, the two operations must also be specified. Thus, a more accurate 
way to refer to a ring would be as {R, +, •}, where, by convention, the addition
operation is listed first. However, we shall almost always use the abbreviated 
notation, R, because there will be no doubt about the notation for and 
meaning of addition and multiplication in the ring. 

(3) It is really part of the definition of a binary operation that we have 
closure under the operation. However, we have listed the closure axioms A1 
and Ml, for addition and multiplication, explicitly, in order to provide 
emphasis; without these axioms closure tends to be ignored or forgotten. 

(4) The sum a + b and the product ab are defined for all choices of a and b 
in R; in particular, a and b may be the same element of R. 

(5) We have a + b defined for all $a, b \in R$, but the expression a + b + c has
no meaning at this stage. The binary operation + tells us about adding two
elements of R, but it says nothing about adding three elements. One might
try adding a and b, and then adding this result and c; the end product would
be expressed as (a + b) + c in keeping with the customary usage for
parentheses. On the other hand, there is one more way possible for adding a, b, c
(in this order), namely a + (b + c). The associative law for addition, A2,
asserts that (a + b) + c = a + (b + c), so the two computations give the same
result. Thus, the associative law allows us to use the notation a + b + c,
without any ambiguity, for both (a + b) + c and a + (b + c).

Clearly, similar remarks apply for the symbol abc in virtue of the
associative law for multiplication, M2.

(6) The symbol “=” (which we read as “equals”) appears throughout the
definition. What does it mean? More precisely, what is the meaning of a = b?
For us, the meaning is simply that they are the “same.” The symbols may
look different from each other, but they refer to the same element of R; they
are different names for the same element. Because of this, it is a rule of logic
that if a appears in an expression, then it may be replaced by b without
affecting the “value” of the expression. This procedure is often carelessly
called “substitution”; “replacement” is a much more appropriate term. For
example, if a = b, then, by replacement, a + c is the same element as b + c;
that is, a + c = b + c.

It may be noted that the reflexive, symmetric, and transitive laws are valid
for the notion of equality; that is: a = a for all a; if a = b, then b = a; if
a = b and b = c, then a = c.

One final illustration: Suppose a = b and c = d; then, by replacement,
a + c = b + c and b + c = b + d, so, by transitivity, a + c = b + d. Of course,
this is the familiar statement that adding equals to equals gives equals. Once 
this kind of thing has been stated in a single instance, we will not fuss about 
such “ commonly accepted rules of logic ” in the future. 

(7) Axiom A3 postulates the existence of an element, denoted by 0 and 
called zero or zero element, which may be added (on the right) to any element 
of R with no effect. It is this property which is the basis for calling 0 an identity 
for addition. Such an element 0 behaves like the element zero of Z, and this 
accounts for the notation “0” and the name “zero.” 

Incidentally, we do not yet know how many such zero elements there are 
in R. The phrase “there exists” in Axiom A3 asserts only that there is at 
least one zero element in R ; quite possibly there are several zero elements. The 
true state of affairs will be clarified soon. 

(8) Once an element 0 has been fixed there exists (according to A4) for
each $a \in R$ an element $\bar{a} \in R$, called an inverse of a, such that $a + \bar{a} = 0$. The
question of how many inverses an element a has for addition will be answered
shortly.

(9) In virtue of the commutative law for addition, A5, we know that
0 + a = a for all $a \in R$ and that $\bar{a} + a = 0$. These properties were not stated
as parts of A3 and A4, respectively, precisely because they are immediate
consequences of the commutative law.

Note that the axioms say nothing about the commutative law for
multiplication; it may hold or it may not.

(10) The five axioms A1-A5 are concerned solely with addition, while the
two axioms M1 and M2 are concerned solely with multiplication. There is
some parallelism between the axioms for multiplication and addition in the
sense that M1 corresponds to A1 and M2 corresponds to A2. This parallelism
could be extended by introducing additional Axioms M3-M5 for
multiplication. At a later stage, we shall see what happens when one or more of the
axioms M3-M5 is added to the axioms for a ring.

The only connection between addition and multiplication in a ring is
provided by the distributive laws, D1. Note that because multiplication in a
ring need not be commutative, we require two distributive laws.

(11) As stated at the beginning of this section, the axioms for a ring are 
motivated by properties which hold in Z and Z m . Why this choice of axioms ? 
The proof of the pudding is presumably in the eating. As the discussion 
progresses, it will be seen that our choice of axioms for a ring is not a bad one. 
There are all kinds of rings, and computations in a ring behave according to 
many of the standard rules. If the axioms did not give “suitable results” 
(whatever this means) we would change them. 

2-1-3. Remark. We have made a big point about the axioms for a ring 
being chosen as properties that hold in Z and Z m . While it is clear that Z is a 
ring, it is not quite so obvious that Z m is a ring—the axioms need to be verified. 
Of course, it is meaningless to attempt to verify that Z m is a ring until the 
operations of addition and multiplication have been specified. Naturally, the 
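operations are taken to be those defined in 1-7-14; namely,

$$\lfloor a \rfloor_m + \lfloor b \rfloor_m = \lfloor a + b \rfloor_m, \qquad \lfloor a \rfloor_m \cdot \lfloor b \rfloor_m = \lfloor ab \rfloor_m.$$

These definitions say, in particular, that we have closure for both addition
and multiplication.

The verification of the remaining ring axioms is straightforward. The
associative law for addition says that

$$(\lfloor a \rfloor_m + \lfloor b \rfloor_m) + \lfloor c \rfloor_m = \lfloor a \rfloor_m + (\lfloor b \rfloor_m + \lfloor c \rfloor_m)$$

for all $\lfloor a \rfloor_m, \lfloor b \rfloor_m, \lfloor c \rfloor_m$ in $Z_m$; it is valid because

$$(\lfloor a \rfloor_m + \lfloor b \rfloor_m) + \lfloor c \rfloor_m = \lfloor a + b \rfloor_m + \lfloor c \rfloor_m = \lfloor (a + b) + c \rfloor_m$$

and

$$\lfloor a \rfloor_m + (\lfloor b \rfloor_m + \lfloor c \rfloor_m) = \lfloor a \rfloor_m + \lfloor b + c \rfloor_m = \lfloor a + (b + c) \rfloor_m.$$

Similarly, the associative law for multiplication says that

$$(\lfloor a \rfloor_m \cdot \lfloor b \rfloor_m) \cdot \lfloor c \rfloor_m = \lfloor a \rfloor_m \cdot (\lfloor b \rfloor_m \cdot \lfloor c \rfloor_m)$$

for all $\lfloor a \rfloor_m, \lfloor b \rfloor_m, \lfloor c \rfloor_m$ in $Z_m$; it is valid because

$$(\lfloor a \rfloor_m \cdot \lfloor b \rfloor_m) \cdot \lfloor c \rfloor_m = \lfloor ab \rfloor_m \cdot \lfloor c \rfloor_m = \lfloor (ab)c \rfloor_m$$

and

$$\lfloor a \rfloor_m \cdot (\lfloor b \rfloor_m \cdot \lfloor c \rfloor_m) = \lfloor a \rfloor_m \cdot \lfloor bc \rfloor_m = \lfloor a(bc) \rfloor_m.$$

(Note that the proofs of the two associative laws in $Z_m$ rest on the fact that
the corresponding associative laws hold in Z. As a matter of fact, the
verification of each of the axioms in $Z_m$ boils down to the use of the corresponding
axiom in Z.)

The element $\lfloor 0 \rfloor_m$ of $Z_m$ is an identity for addition, since
$\lfloor a \rfloor_m + \lfloor 0 \rfloor_m = \lfloor a \rfloor_m$ for all $\lfloor a \rfloor_m \in Z_m$. Every element of $Z_m$ has an
inverse for addition; in fact, an inverse of an arbitrary element $\lfloor a \rfloor_m$ is the
element $\lfloor -a \rfloor_m$, since $\lfloor a \rfloor_m + \lfloor -a \rfloor_m = \lfloor 0 \rfloor_m$. Of course, the commutative
law for addition holds. Only the distributive laws remain. To show that

$$\lfloor a \rfloor_m \cdot (\lfloor b \rfloor_m + \lfloor c \rfloor_m) = \lfloor a \rfloor_m \cdot \lfloor b \rfloor_m + \lfloor a \rfloor_m \cdot \lfloor c \rfloor_m$$

for all $\lfloor a \rfloor_m, \lfloor b \rfloor_m, \lfloor c \rfloor_m$ in $Z_m$, we observe that

$$\lfloor a \rfloor_m \cdot (\lfloor b \rfloor_m + \lfloor c \rfloor_m) = \lfloor a \rfloor_m \cdot \lfloor b + c \rfloor_m = \lfloor a(b + c) \rfloor_m = \lfloor ab + ac \rfloor_m$$

and

$$\lfloor a \rfloor_m \cdot \lfloor b \rfloor_m + \lfloor a \rfloor_m \cdot \lfloor c \rfloor_m = \lfloor ab \rfloor_m + \lfloor ac \rfloor_m = \lfloor ab + ac \rfloor_m.$$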

The other distributive law goes exactly the same way. This concludes the 
verification that Z m is a ring. 
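
For a specific small m, the ring axioms can also be confirmed by brute force. The following sketch is not a substitute for the proof above; it is an illustration under stated assumptions: the function name `is_ring_mod` is hypothetical, and each congruence class is represented by its smallest nonnegative element 0, 1, ..., m - 1, so that addition and multiplication become ordinary arithmetic followed by reduction mod m. Closure (A1 and M1) is automatic in this representation, because reducing mod m always lands back in the range 0, ..., m - 1.

```python
from itertools import product

def is_ring_mod(m):
    """Check axioms A2-A5, M2, and D1 for Z_m by testing every case."""
    R = range(m)
    add = lambda a, b: (a + b) % m
    mul = lambda a, b: (a * b) % m
    triples = list(product(R, repeat=3))
    checks = [
        all(add(add(a, b), c) == add(a, add(b, c)) for a, b, c in triples),  # A2
        all(add(a, 0) == a for a in R),                                      # A3
        all(any(add(a, x) == 0 for x in R) for a in R),                      # A4
        all(add(a, b) == add(b, a) for a, b in product(R, repeat=2)),        # A5
        all(mul(mul(a, b), c) == mul(a, mul(b, c)) for a, b, c in triples),  # M2
        all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c)) and
            mul(add(b, c), a) == add(mul(b, a), mul(c, a))
            for a, b, c in triples),                                         # D1
    ]
    return all(checks)

print(is_ring_mod(6), is_ring_mod(11))   # True True
```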

It is rather cumbersome to denote an element of Z m by |_<zJ OT and to use 
such symbols for computation. The customary way to simplify the notation 
is to represent each congruence class by its smallest nonnegative element. 
Thus for |_0j m , |_l_| m ,..., \m -1 | m we write 0, 1, 2,... , m — 1, respectively. 
This is in keeping with what was done forZ 7 in 1-7-15. Addition and 
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multiplication in Z m may then be given completely by tables. For example, 
in Z 6 , addition and multiplication are given by: 


addition in $Z_6$

    + | 0 1 2 3 4 5
    --+------------
    0 | 0 1 2 3 4 5
    1 | 1 2 3 4 5 0
    2 | 2 3 4 5 0 1
    3 | 3 4 5 0 1 2
    4 | 4 5 0 1 2 3
    5 | 5 0 1 2 3 4

multiplication in $Z_6$

    · | 0 1 2 3 4 5
    --+------------
    0 | 0 0 0 0 0 0
    1 | 0 1 2 3 4 5
    2 | 0 2 4 0 2 4
    3 | 0 3 0 3 0 3
    4 | 0 4 2 0 4 2
    5 | 0 5 4 3 2 1

Similarly, the addition and multiplication tables for $Z_{11}$ are:

addition in $Z_{11}$

     + |  0  1  2  3  4  5  6  7  8  9 10
    ---+---------------------------------
     0 |  0  1  2  3  4  5  6  7  8  9 10
     1 |  1  2  3  4  5  6  7  8  9 10  0
     2 |  2  3  4  5  6  7  8  9 10  0  1
     3 |  3  4  5  6  7  8  9 10  0  1  2
     4 |  4  5  6  7  8  9 10  0  1  2  3
     5 |  5  6  7  8  9 10  0  1  2  3  4
     6 |  6  7  8  9 10  0  1  2  3  4  5
     7 |  7  8  9 10  0  1  2  3  4  5  6
     8 |  8  9 10  0  1  2  3  4  5  6  7
     9 |  9 10  0  1  2  3  4  5  6  7  8
    10 | 10  0  1  2  3  4  5  6  7  8  9

multiplication in $Z_{11}$

     · |  0  1  2  3  4  5  6  7  8  9 10
    ---+---------------------------------
     0 |  0  0  0  0  0  0  0  0  0  0  0
     1 |  0  1  2  3  4  5  6  7  8  9 10
     2 |  0  2  4  6  8 10  1  3  5  7  9
     3 |  0  3  6  9  1  4  7 10  2  5  8
     4 |  0  4  8  1  5  9  2  6 10  3  7
     5 |  0  5 10  4  9  3  8  2  7  1  6
     6 |  0  6  1  7  2  8  3  9  4 10  5
     7 |  0  7  3 10  6  2  9  5  1  8  4
     8 |  0  8  5  2 10  7  4  1  9  6  3
     9 |  0  9  7  5  3  1 10  8  6  4  2
    10 |  0 10  9  8  7  6  5  4  3  2  1
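
Tables of this kind are tedious to write by hand but easy to generate. The Python sketch below is an illustration added here, not something from the original text; the names `cayley_table` and `format_table` are hypothetical, and congruence classes are again assumed to be represented by 0, 1, ..., m - 1.

```python
import operator


def cayley_table(m, op):
    """Return the m-by-m table of op(a, b) reduced mod m, rows indexed by a."""
    return [[op(a, b) % m for b in range(m)] for a in range(m)]


def format_table(m, op, symbol):
    """Render the table as plain text with a header row and a row label column."""
    width = len(str(m - 1))
    header = symbol.rjust(width) + " | " + " ".join(str(b).rjust(width) for b in range(m))
    lines = [header, "-" * len(header)]
    for a, row in enumerate(cayley_table(m, op)):
        lines.append(str(a).rjust(width) + " | " + " ".join(str(x).rjust(width) for x in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_table(6, operator.add, "+"))
    print()
    print(format_table(11, operator.mul, "*"))
```

Calling `format_table(m, operator.add, "+")` or `format_table(m, operator.mul, "*")` for other values of m produces the tables asked for in the problems at the end of this section.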
2-1-4. Definition. A ring R is said to be a ring with unity (or with identity
or with “one”) when there exists an element $e \in R$ such that

$ea = ae = a$ for all $a \in R$.

A ring R is said to be commutative when

$ab = ba$ for all $a, b \in R$,

that is, when the commutative law for multiplication holds.

Under our scheme for numbering axioms, the commutative law for
multiplication would be M5. The existence of an identity for multiplication would
be Axiom M3. Because one often deals with a ring with unity in which the
commutative law for multiplication is not valid (such a ring would be called a
noncommutative ring with unity) we require that the element e should serve
as an identity both on the left and on the right.

It is clear that Z and $Z_m$ are commutative rings with unity. Of course,
$\lfloor 1 \rfloor_m$ is a unity element of $Z_m$.

It is easy to exhibit many rings, but we shall defer the discussion of
examples to the next section. Here we shall undertake to derive a few of the
elementary consequences of the ring axioms. In particular, we shall see that a
number of standard properties of Z are valid in any ring. Thus let us consider
an arbitrary ring {R, +, ·}. It is understood that henceforth, in this section, all
elements come from R and all statements are about R.


2-1-5. Proposition. The zero element, 0, is unique. 


Proof: We know that a + 0 = a for all a. Suppose there is another zero
element, call it 0'; so a + 0' = a for all a. We must show 0 = 0'. Since both
0 and 0' are zero elements and since the commutative law for addition
holds, we have:


0 = 0 + 0' = 0' + 0 = 0'. ∎


2-1-6. Proposition. If a + b = a + c, then b = c. 


Proof : This is known as the cancellation law for addition. Because addition 
is commutative it is not necessary to state the cancellation law on the other 
side; more precisely, once the cancellation law stated above is proved then it 
is immediate that b + a = c + a implies b = c. 





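As for the proof, making use of replacement, we have

$$\begin{aligned}
a + b = a + c &\Rightarrow \bar{a} + (a + b) = \bar{a} + (a + c) \\
&\Rightarrow (\bar{a} + a) + b = (\bar{a} + a) + c \\
&\Rightarrow 0 + b = 0 + c \\
&\Rightarrow b = c. \qquad ∎
\end{aligned}$$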


2-1-7. Proposition. If a + b = a, then b = 0. 

Proof: The axiom about the zero element, A3, says that a + 0 = a for all a . 
The assertion here is a sort of weak converse; it says that if b is an element 
which serves as an additive identity for just one element a , then b = 0. The 
proof itself is trivial. We have a + b = a = a + 0, so by the cancellation law, 
b= 0. | 

2-1-8. Proposition. For each a, its additive inverse $\bar{a}$ is unique.

Proof: We have $a + \bar{a} = 0$ and must show that if $a + a' = 0$ (meaning that
a' is an additive inverse of a), then $a' = \bar{a}$. But this is trivial, since
$a + a' = a + \bar{a}$ implies $a' = \bar{a}$, by the cancellation law. ∎

2-1-9. Proposition. The zero element is its own inverse under addition;
in symbols,

$\bar{0} = 0$.

Proof: By definition of additive inverse, $0 + \bar{0} = 0$. On the other hand,
$0 + 0 = 0$. Hence, by cancellation, $\bar{0} = 0$. ∎

2-1-10. Proposition. For each a, we have

$\bar{\bar{a}} = a$.

Proof: In other words, with respect to addition, the inverse of the inverse
of an element is the element itself. As for the proof, by definition of inverse,
$\bar{a} + \bar{\bar{a}} = 0$. Of course, $\bar{a} + a = a + \bar{a} = 0$, so by cancellation, $\bar{\bar{a}} = a$. ∎

2-1-11. Proposition. For any $a, b \in R$ there exists a unique element $x \in R$
such that $a + x = b$; in fact, $x = \bar{a} + b$.

Proof: A much more colloquial way to express this is to say: the equation
$a + x = b$ has a unique solution in R, namely $x = \bar{a} + b$.

It is clear that $\bar{a} + b$ is a solution, since $a + (\bar{a} + b) = (a + \bar{a}) + b = 0 + b = b$.
As for uniqueness, if $y \in R$ is also a solution, then $a + y = b = a + x$, so by
cancellation x = y. ∎

2-1-12. Proposition. For any a , 

a • 0 = 0 = 0 • a. 


Proof : This tells us that 0 behaves “correctly” (meaning, as expected) 
with respect to multiplication. Because the basic characteristic of 0 is additive, 
and 0 is multiplied here by an arbitrary element a , one might expect the proof 
to depend on the connections between addition and multiplication—that is, 
on the distributive laws. In fact, we have 

$$a0 = a(0 + 0) = a0 + a0$$

and then 2-1-7 permits the conclusion $a0 = 0$. The proof of $0a = 0$ proceeds
in similar fashion. ∎

2-1-13. Proposition. For any a and b, we have

$\bar{a}b = a\bar{b} = \overline{ab}$ and $\bar{a}\bar{b} = ab$.


Proof: In virtue of the preceding result,

$0 = a0 = a(b + \bar{b}) = ab + a\bar{b}$.

This says that $a\bar{b}$ is the additive inverse of ab; in symbols, $a\bar{b} = \overline{ab}$. In the
same way,

$0 = 0b = (a + \bar{a})b = ab + \bar{a}b$

so that $\bar{a}b = \overline{ab}$. This proves the first part.

For any a and b we now know that $\bar{a}b = a\bar{b}$. Applying this to the elements
a and $\bar{b}$, and using 2-1-10, we have


$\bar{a}\bar{b} = a\bar{\bar{b}} = ab$

which proves the second part. ∎

2-1-14. Proposition. For any a and b, we have


$\overline{a + b} = \bar{a} + \bar{b}$.

Proof: In words, this says that the inverse of a sum is the sum of the
inverses. Let us give the proof in full detail. Consider

$$\begin{aligned}
(a + b) + (\bar{a} + \bar{b}) &= ((a + b) + \bar{a}) + \bar{b} \\
&= (\bar{a} + (a + b)) + \bar{b} \\
&= ((\bar{a} + a) + b) + \bar{b} \\
&= (0 + b) + \bar{b} \\
&= b + \bar{b} \\
&= 0,
\end{aligned}$$

which says that $\overline{a + b} = \bar{a} + \bar{b}$.

The reason for the fussing at the beginning of the proof is that the symbol
$a + b + \bar{a} + \bar{b}$ has no meaning because we have no associative law for adding
four elements. Instead, we must keep the parentheses and apply the associative
law to three elements at a time. ∎


2-1-15. Proposition. If R has an identity for multiplication, then it is 
unique. 


Proof: By hypothesis, there exists an element $e \in R$ such that $ea = a = ae$
for all $a \in R$. Suppose $e' \in R$ is also an identity for multiplication, so
$e'a = a = ae'$ for all $a \in R$. Now consider ee'. Since e is an identity on the left, this
equals e', and since e' is an identity on the right, this equals e. Thus,

e' = ee' = e.

Note that the proof here goes just like the proof for uniqueness of the zero
element. ∎


2-1-16. Proposition. If R has a unity e, then $(\bar{e})(\bar{e}) = e$ and $\bar{e}a = \bar{a}$ for all
$a \in R$.


Proof. According to 2-1-13, $\bar{e} \cdot \bar{e} = e \cdot e = e$, and for any a, $\bar{e}a = \overline{ea} = \bar{a}$. ∎


2-1-17. Remark. Now, let us revert to the more common notation and
write $-a$ (called minus a) instead of $\bar{a}$, and $b - a$ (which is read as b minus a)
instead of $b + (-a) = b + \bar{a}$. Thus, what is normally called “subtracting a
from b” is really adding the additive inverse of a to b.

Rewriting the preceding properties (namely, 2-1-9, 2-1-10, 2-1-13, 2-1-14,
2-1-16) in terms of the minus notation, we have:

(i) $-0 = 0$,

(ii) $-(-a) = a$,

(iii) $(-a)b = a(-b) = -(ab)$,

(iv) $(-a)(-b) = ab$,

(v) $-(a + b) = -a - b$,

(vi) $(-e)(-e) = e$ and $(-e)a = -a$ when e is an identity for multiplication.

We chose not to introduce the minus notation at the start solely for
pedagogical reasons, so that the notation would not lead the reader to assume or
expect that certain properties of minus hold before they have been proved
carefully. In general, the standard and familiar rules for computation with the 
minus sign also hold in a ring. For example, the meaning of a — b + c in a 
ring is clear, and it is equal to a + c — b. Additional examples will appear in 
the problems. 
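
As a sanity check on rules (i) through (vi), the following Python sketch (an added illustration, not from the original text) tests them exhaustively in $Z_m$ for a few small m. The helper names `neg` and `minus_rules_hold` are hypothetical, classes are assumed to be represented by 0, 1, ..., m - 1, and the unity e is taken to be the class of 1.

```python
from itertools import product


def neg(a, m):
    """Additive inverse of the class of a in Z_m."""
    return (-a) % m


def minus_rules_hold(m):
    """Check rules (i)-(vi) of 2-1-17 for every pair of elements of Z_m."""
    e = 1 % m  # unity of Z_m (equals 0 only in the trivial ring Z_1)
    if neg(0, m) != 0:                                            # (i)
        return False
    if (neg(e, m) * neg(e, m)) % m != e:                          # (vi), first part
        return False
    for a, b in product(range(m), repeat=2):
        ab = (a * b) % m
        if neg(neg(a, m), m) != a:                                # (ii)
            return False
        if (neg(a, m) * b) % m != neg(ab, m):                     # (iii)
            return False
        if (a * neg(b, m)) % m != neg(ab, m):                     # (iii)
            return False
        if (neg(a, m) * neg(b, m)) % m != ab:                     # (iv)
            return False
        if neg((a + b) % m, m) != (neg(a, m) + neg(b, m)) % m:    # (v)
            return False
        if (neg(e, m) * a) % m != neg(a, m):                      # (vi), second part
            return False
    return True


if __name__ == "__main__":
    print(all(minus_rules_hold(m) for m in range(1, 15)))
```

The rules hold in every ring by the propositions above; the check merely illustrates them in the commutative rings $Z_m$.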

Thus far we have seen that many properties of Z also hold in any ring. On 
the other hand, there are a number of extremely important properties of Z 
that need not hold in an arbitrary ring. For example, if ab = 0, where a and 
b are elements of the ring R, we cannot conclude that at least one of these 
elements is 0. This seemingly pathological situation, in which the product of
two nonzero elements is zero, occurs in $Z_6$, for here $\lfloor 2 \rfloor_6 \ne \lfloor 0 \rfloor_6$, $\lfloor 3 \rfloor_6 \ne \lfloor 0 \rfloor_6$,
and $\lfloor 2 \rfloor_6 \cdot \lfloor 3 \rfloor_6 = \lfloor 0 \rfloor_6$. In order to focus on this kind of situation, we make a
definition.

2-1-18. Definition. A nonzero element a in the commutative ring R is said
to be a zero-divisor when there exists $b \ne 0$ in R such that $ab = 0$. A
commutative ring with unity element $e \ne 0$ and no zero-divisors is known as an
integral domain; it is usually denoted by D.

A few comments about the definition are in order: (1) According to the
definition, only a nonzero element can be a zero-divisor; the element 0 is not
a zero-divisor. In particular, $\lfloor 2 \rfloor_6$ is a zero-divisor in $Z_6$ and $\lfloor 0 \rfloor_6$ is not.
(2) Note further that a zero-divisor was defined only in a commutative ring.
If the ring were not commutative, it would be necessary to deal with $ab = 0$
and $ba = 0$ separately, and to distinguish between right and left, a distinction
which we choose to avoid. (3) The reason for the requirement $e \ne 0$ in an
integral domain D is not mysterious at all. If $e = 0$, then for every $a \in D$

$0 = 0a = ea = a$,


which means that D consists of just the one element 0, and this ring is so
trivial that it may as well be excluded from the discussion. (4) The basic
condition for an integral domain may be stated as:

if $ab = 0$ and $a \ne 0$, then $b = 0$.

In fact, if $a \ne 0$, then a is not a zero-divisor precisely when $ab = 0$ implies
$b = 0$.
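
To make the definition concrete, here is a small Python sketch (added for illustration; the function name `zero_divisors` is hypothetical) that lists the zero-divisors of $Z_m$ directly from Definition 2-1-18, with classes represented by 0, 1, ..., m - 1.

```python
def zero_divisors(m):
    """Nonzero a in Z_m for which some nonzero b in Z_m satisfies ab = 0."""
    return [
        a
        for a in range(1, m)
        if any((a * b) % m == 0 for b in range(1, m))
    ]


if __name__ == "__main__":
    for m in (5, 6, 7, 12):
        print(m, zero_divisors(m))
```

For m = 6 the list contains 2, 3, and 4, in agreement with the multiplication table for $Z_6$, while for m = 5 and m = 7 it is empty.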

Obviously, Z is an integral domain, and it is easy to verify (by trial and 
error) that Z 5 and Z 7 are integral domains. On the other hand, Z 6 is not an 
integral domain. What about the commutative ring with unity Z m ? The full 
story is given by the next result. 

2-1-19. Theorem. The ring $Z_m$ is an integral domain $\Leftrightarrow$ m is prime.

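Proof: In order to be completely accurate and avoid possible confusion
we use the $\lfloor\ \rfloor_m$ notation for elements of $Z_m$.

Suppose m is composite. Then there exist integers a and b such that
$m = ab$, $1 < a < m$, $1 < b < m$. We have, therefore,

$$\lfloor a \rfloor_m \lfloor b \rfloor_m = \lfloor ab \rfloor_m = \lfloor m \rfloor_m = \lfloor 0 \rfloor_m$$

with $\lfloor a \rfloor_m \ne \lfloor 0 \rfloor_m$ and $\lfloor b \rfloor_m \ne \lfloor 0 \rfloor_m$, so $Z_m$ is not an integral domain. This
proves: m composite $\Rightarrow$ $Z_m$ is not an integral domain. Of course, this
implication is logically equivalent to: $Z_m$ is an integral domain $\Rightarrow$ m is prime.

To prove the other half of the theorem, suppose m is a prime p; we must
show that $Z_p$ is an integral domain. Suppose $\lfloor a \rfloor_p \ne \lfloor 0 \rfloor_p$ and $\lfloor a \rfloor_p \lfloor b \rfloor_p = \lfloor 0 \rfloor_p$;
then

$$\begin{aligned}
\lfloor a \rfloor_p \lfloor b \rfloor_p = \lfloor 0 \rfloor_p &\Rightarrow \lfloor ab \rfloor_p = \lfloor 0 \rfloor_p \\
&\Rightarrow ab \equiv 0 \pmod{p} \\
&\Rightarrow p \mid ab.
\end{aligned}$$

Since $\lfloor a \rfloor_p \ne \lfloor 0 \rfloor_p$ we know that $p \nmid a$. Because p is prime, we conclude that
$p \mid b$, or what is the same thing, $\lfloor b \rfloor_p = \lfloor 0 \rfloor_p$. Consequently, the arbitrary
nonzero element $\lfloor a \rfloor_p$ is not a zero-divisor (because there is no nonzero $\lfloor b \rfloor_p$
for which $\lfloor a \rfloor_p \lfloor b \rfloor_p = \lfloor 0 \rfloor_p$) and $Z_p$ is an integral domain. This completes the
proof. ∎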
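
The theorem can be confirmed numerically for small moduli. The sketch below is an added illustration in Python; the helper names `is_prime` and `is_integral_domain_mod` are hypothetical. Since $Z_m$ is already known to be a commutative ring with unity, and the class of 1 differs from the class of 0 when m > 1, only the absence of zero-divisors has to be tested.

```python
from math import isqrt


def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    return n > 1 and all(n % d != 0 for d in range(2, isqrt(n) + 1))


def is_integral_domain_mod(m):
    """True when Z_m (m > 1) has no pair of nonzero elements with product 0."""
    return m > 1 and all(
        (a * b) % m != 0 for a in range(1, m) for b in range(1, m)
    )


if __name__ == "__main__":
    print(all(is_integral_domain_mod(m) == is_prime(m) for m in range(2, 60)))
```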

2-1-20. Remark. Suppose R is a commutative ring with unity $e \ne 0$ in
which the following property, known as the cancellation law for multiplication,
holds:

(1) If $ab = ac$ and $a \ne 0$, then $b = c$.

It is easy to see that the cancellation law is equivalent to the condition

(2) If $ab = 0$ and $a \ne 0$, then $b = 0$.

In fact, if (1) holds, then $ab = 0 = a0$, and by cancellation, $b = 0$; so (2)
holds. Conversely, if (2) holds then $ab = ac$ implies $a(b - c) = 0$, and by (2)
we have $b - c = 0$; so $b = c$, and (1) holds.

Of course, condition (2) says that an element $a \ne 0$ cannot be a
zero-divisor, so condition (2) holds if and only if R is an integral domain.
Consequently, we have proved: if R is a commutative ring with unity $e \ne 0$, then
R is an integral domain $\Leftrightarrow$ the cancellation law for multiplication holds.
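
The equivalence of conditions (1) and (2) can be observed in the rings $Z_m$. The following Python sketch is an added illustration with hypothetical function names; it checks both conditions by enumeration, with classes represented by 0, 1, ..., m - 1.

```python
def cancellation_holds(m):
    """Condition (1): ab = ac with a != 0 forces b = c in Z_m."""
    return all(
        b == c
        for a in range(1, m)
        for b in range(m)
        for c in range(m)
        if (a * b) % m == (a * c) % m
    )


def no_zero_divisors(m):
    """Condition (2): ab = 0 with a != 0 forces b = 0 in Z_m."""
    return all((a * b) % m != 0 for a in range(1, m) for b in range(1, m))


if __name__ == "__main__":
    for m in range(2, 13):
        print(m, cancellation_holds(m), no_zero_divisors(m))
```

In $Z_6$, for instance, 2 · 1 = 2 · 4 although 1 and 4 are different, so cancellation fails exactly where a zero-divisor is present.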

2-1-21 / PROBLEMS

1. In any ring, prove the following rules:

(i) $a + (b + c) = c + (a + b)$,

(ii) $(a(b + c))d = (ab)d + a(cd)$,

(iii) $a + ((b + c) + d) = ((a + b) + c) + d = (a + b) + (c + d)$,

(iv) $(a + b)(c + d) = (ac + bc) + (ad + bd) = (ac + ad) + (bc + bd)$.

2. In a commutative ring, show that

(i) $a(bc) = c(ba)$,

(ii) $a((b + c)d) = (ad)b + (ad)c$,

(iii) $(a + b)(a + b) = (a^2 + b^2) + (ab + ab)$.

3. Construct addition and multiplication tables for $Z_2$, $Z_3$, $Z_8$, and $Z_{10}$.

4. In $Z_{11}$, compute 5 · 8, 5 · 9, 5 · (8 + 9), (5 · 8)9, 5(8 · 9), 9(8 · 5), 8(5 · 9),
8(9 + 5), 9(8 + 5).

5. In $Z_{11}$, check the rules given in Problems 1 and 2 when a = 3, b = 7,
c = 2, d = 9.

6. Construct addition and multiplication tables for $Z_{12}$. Find all pairs
$a, b \in Z_{12}$ for which $ab = 0$, and all pairs $a, b \in Z_{12}$ for which $ab = 1$.

7. Consider the set Z = {0, ±1, ±2, ...}. Define addition of two integers
as usual, but define the product of any two integers to be 0. Is Z a ring
with respect to these operations?

8. Exhibit a ring that has exactly two elements. If n is a positive integer, does
there exist a ring with exactly n elements?

9. In an arbitrary ring, does $\bar{a} = \bar{b}$ imply $a = b$?

10. In any ring, verify that

(i) $(a - b) + (c - d) = (a + c) - (b + d)$,

(ii) $-(a - b) = b - a$,

(iii) $(a - b) - (c - d) = (a + d) - (b + c)$,

(iv) $a(b - c) = ab - ac$.

11. In any ring, verify that

(i) $-((-a)b) = ab = -(a(-b))$,

(ii) $(a + b)(c - d) = (ac - bd) + (bc - ad) = (ac + bc) - (ad + bd)$,

(iii) $(a - b)(c + d) = (ac - bd) + (ad - bc) = (ac + ad) - (bc + bd)$,

(iv) $(a - b)(c - d) = (ac + bd) - (bc + ad)$.

12. In $Z_{13}$ verify the assertions of Problems 10 and 11 when a = 5, b = 10,
c = 1, d = 11.

13. In any ring, evaluate:

(i) $(a + b)^2$, (ii) $(a - b)^2$, (iii) $(a - b + c)(a + b - c)$,

(iv) $(a + b)^3$, (v) $(a - b)^3$, (vi) $(a - b + c)^3$.

14. If the ring R is commutative, evaluate all the expressions listed in Problem
13.

15. Find a necessary and sufficient condition on the ring R that

$(a + b)(a - b) = a^2 - b^2$ for all $a, b \in R$.

What is meant by a necessary and sufficient condition?

16. For n > 4, discuss the possible meanings for $a_1 + a_2 + \cdots + a_n$ and
$a_1 \cdot a_2 \cdots a_n$ in a ring.

17. In an integral domain, if a is an idempotent (meaning that $a^2 = a$), then
a is 0 or e. Can you give an example of a commutative ring with unity in
which there is an idempotent not equal to 0 or e?


2-2. Examples 

For us, the rings of primary interest are Z and Z m . Nevertheless, it is 
useful to give many examples of rings—some of which may never be referred 
to again. These should deepen our understanding of the concept of a ring and 
provide us with a certain amount of intuition about rings in general. 

2-2-1. Example. Let us examine several number systems that are
especially familiar.

(i) Consider the set R of all real numbers. It would take us too far afield to
discuss exactly what is meant by a real number. (As a matter of fact, to do it
properly would probably require a full semester.) Instead, we shall assume
here that the real numbers are “known”, just as we assumed in the preceding
chapter that the integers Z are known. One may wish to think of real numbers
as the points on the so-called number line. Under the customary operations
of addition and multiplication, it is immediate that R is a ring because the
required axioms are simply “known” properties of real numbers. Even more,
R is surely a commutative ring with identity 1, and since the product of two
nonzero real numbers cannot equal 0, it follows that R is an integral domain.


(ii) Consider the set Q of all rational numbers. By this is meant the set of
all numbers that can be expressed as the ratio of two integers; in symbols,

$$Q = \left\{ \frac{m}{n} \;\middle|\; m, n \in Z,\ n \ne 0 \right\}.$$

Of course, we are assuming that the rational numbers are “known”; because
of this, words like “fraction” or “ratio,” and symbols like m/n, have meaning.
Naturally, we know how to add and multiply rational numbers (that is,
“fractions”). The usual rules are given by

$$\frac{m_1}{n_1} + \frac{m_2}{n_2} = \frac{m_1 n_2 + m_2 n_1}{n_1 n_2},$$

$$\frac{m_1}{n_1} \cdot \frac{m_2}{n_2} = \frac{m_1 m_2}{n_1 n_2}.$$

These rules may be taken as the definitions of addition and multiplication in
Q. Under these operations, Q is a ring because all the axioms for a ring are
properties that have always been taken for granted in the set of rational
numbers. However, one can be much more formal and actually verify the ring
axioms. Let us sketch this quickly.

To verify closure for addition we must show that if $m_1/n_1$ and $m_2/n_2$ are
elements of Q, then so is $(m_1/n_1) + (m_2/n_2)$. Because $m_1, m_2, n_1, n_2$ are
integers, so are $m_1 n_2 + m_2 n_1$ and $n_1 n_2$. In addition, $n_1 n_2 \ne 0$ because $n_1 \ne 0$
and $n_2 \ne 0$. Therefore,

$$\frac{m_1 n_2 + m_2 n_1}{n_1 n_2}$$

is indeed an element of Q, and Axiom A1 holds. In similar fashion, $m_1 m_2$ is an
integer and $n_1 n_2$ is a nonzero integer, so $m_1 m_2 / (n_1 n_2) \in Q$, and it follows that
Axiom M1, closure for multiplication, holds. Note that the verification of
these axioms (and the remaining ones also) depends on properties of Z.

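The associative law for addition, A2, is valid because

$$\begin{aligned}
\left( \frac{m_1}{n_1} + \frac{m_2}{n_2} \right) + \frac{m_3}{n_3}
&= \frac{m_1 n_2 + m_2 n_1}{n_1 n_2} + \frac{m_3}{n_3} \\
&= \frac{(m_1 n_2 + m_2 n_1) n_3 + m_3 (n_1 n_2)}{(n_1 n_2) n_3} \\
&= \frac{m_1 n_2 n_3 + m_2 n_1 n_3 + m_3 n_1 n_2}{n_1 n_2 n_3}
\end{aligned}$$

is equal to

$$\begin{aligned}
\frac{m_1}{n_1} + \left( \frac{m_2}{n_2} + \frac{m_3}{n_3} \right)
&= \frac{m_1}{n_1} + \frac{m_2 n_3 + m_3 n_2}{n_2 n_3} \\
&= \frac{m_1 (n_2 n_3) + n_1 (m_2 n_3 + m_3 n_2)}{n_1 (n_2 n_3)} \\
&= \frac{m_1 n_2 n_3 + m_2 n_1 n_3 + m_3 n_1 n_2}{n_1 n_2 n_3}.
\end{aligned}$$

The associative law for multiplication, M2, is clear because

$$\left( \frac{m_1}{n_1} \cdot \frac{m_2}{n_2} \right) \cdot \frac{m_3}{n_3} = \frac{m_1 m_2 m_3}{n_1 n_2 n_3} = \frac{m_1}{n_1} \cdot \left( \frac{m_2}{n_2} \cdot \frac{m_3}{n_3} \right).$$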

The remaining axioms for addition, A3-A5, are easy: 0/1 is clearly a zero
element, an inverse of m/n is surely (-m)/n, and the commutative law holds
since

$$\frac{m_1}{n_1} + \frac{m_2}{n_2} = \frac{m_1 n_2 + m_2 n_1}{n_1 n_2} = \frac{m_2}{n_2} + \frac{m_1}{n_1}.$$

Finally, it is straightforward to verify the distributive laws, D1, and this is
left to the reader.

Thus, Q is a ring. In addition, 1/1 is an identity for multiplication and the 
commutative law for multiplication holds, so Q is a commutative ring with 
unity. Even more, Q is an integral domain because the product of two nonzero 
rational numbers is not zero. 

All this has gone very smoothly, but there are some difficulties that have
been glossed over. For example, 0/1 is a zero element, but so is 0/2, or 0/3, or
0/n for any $n \ne 0$. Since Q is a ring, this apparently contradicts the uniqueness
of the zero element (see 2-1-5). The difficulty is caused by our carelessness in
defining the elements of Q. A rational number is not just a ratio m/n; we
know for example that

$$\frac{1}{2} = \frac{2}{4} = \frac{3}{6} = \cdots \quad \text{and} \quad \frac{0}{1} = \frac{0}{2} = \frac{0}{3} = \cdots.$$

In other words, there are really many distinct names or expressions for the
same rational number. Consequently, it is necessary to go back to the
beginning, define Q carefully, and verify all the axioms. At a later stage, we shall
develop Q with great care and precision. For the time being, we shall take it
for granted that Q is an integral domain and that the difficulties mentioned
above can be overcome.
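
The issue of "many names for the same rational number" is easy to see on a computer. The Python sketch below is an added illustration, not the careful construction promised for a later stage. It represents a fraction m/n naively as a pair (m, n), applies the two rules stated earlier, and compares names by the familiar cross-multiplication test $m_1 n_2 = m_2 n_1$, which is an assumption made here for the illustration. The function names are hypothetical; Python's standard `fractions.Fraction` type is used only as an independent check.

```python
from fractions import Fraction


def add_pairs(m1, n1, m2, n2):
    """m1/n1 + m2/n2 by the rule (m1*n2 + m2*n1) / (n1*n2)."""
    return (m1 * n2 + m2 * n1, n1 * n2)


def mul_pairs(m1, n1, m2, n2):
    """(m1/n1) * (m2/n2) by the rule (m1*m2) / (n1*n2)."""
    return (m1 * m2, n1 * n2)


def same_rational(p, q):
    """Assumed equality test: m1/n1 and m2/n2 agree when m1*n2 == m2*n1."""
    return p[0] * q[1] == q[0] * p[1]


if __name__ == "__main__":
    # 0/1 and 0/2 are different pairs but name the same rational number.
    print((0, 1) == (0, 2), same_rational((0, 1), (0, 2)))
    # 1/2 + 2/4 is produced as the pair (8, 8), another name for 1.
    s = add_pairs(1, 2, 2, 4)
    print(s, Fraction(*s) == Fraction(1, 2) + Fraction(2, 4))
```

The pairs (0, 1) and (0, 2) are unequal as pairs yet pass the equality test, which is exactly the distinction between an expression m/n and the rational number it names.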
(iii) Consider the set C of all complex numbers. By a complex number
we mean a symbol or expression a + bi where a and b are real numbers and
$i = \sqrt{-1}$. Thus

$C = \{a + bi \mid a \in R,\ b \in R\}$.

Of course, every complex number has a unique expression; in other words,

$a + bi = a' + b'i \Leftrightarrow a = a'$ and $b = b'$.

The standard definitions for addition and multiplication in C are

$(a + bi) + (c + di) = (a + c) + (b + d)i$,

$(a + bi)(c + di) = (ac - bd) + (bc + ad)i$.

Naturally, we choose these definitions of the operations in C precisely because
they express the way we have always added or multiplied complex numbers.
Clearly, C satisfies closure for both of these operations. Furthermore, it is
straightforward to verify that addition and multiplication are commutative,
that both associative laws hold, and that the distributive laws hold; all these
verifications rest ultimately on the properties of R. For example, the
associative law for multiplication follows from

$$\begin{aligned}
((a + bi)(c + di))(e + fi) &= ((ac - bd) + (ad + bc)i)(e + fi) \\
&= ((ac - bd)e - (ad + bc)f) + ((ac - bd)f + (ad + bc)e)i \\
&= (ace - bde - adf - bcf) + (acf - bdf + ade + bce)i
\end{aligned}$$

and

$$\begin{aligned}
(a + bi)((c + di)(e + fi)) &= (a + bi)((ce - df) + (cf + de)i) \\
&= (ace - adf - bcf - bde) + (bce - bdf + acf + ade)i.
\end{aligned}$$

The element 0 = 0 + 0i is an identity for addition. An additive inverse of
a + bi is (-a) + (-b)i. [Strictly speaking, the symbols -(a + bi) or -a - bi
have no meaning. After C has been shown to be a ring, the minus can then be
introduced, and both of these symbols represent the inverse of a + bi.]
Moreover, the element 1 = 1 + 0i is an identity for multiplication. We see,
therefore, that C is a commutative ring with unity.

Now, let us prove that C is an integral domain. (Surely everyone “knows”
that if the product of two complex numbers is 0, then at least one of them must
be 0. The point is that we can prove this “fact” using only properties of R.)
For this consider

$(a + bi)(c + di) = 0$ with $a + bi \ne 0$. (*)

We must show c + di = 0; in other words, c = d = 0. Multiplying out, we
have


$(ac - bd) + (bc + ad)i = 0 = 0 + 0i$

which says

$ac - bd = 0$,
$bc + ad = 0$. (**)


Consequently, $a(ac - bd) = a0 = 0$ and $b(bc + ad) = 0$. Because
multiplication in R is commutative, adding these expressions yields

$(a^2 + b^2)c = 0$. (#)

Similarly, from (**) we obtain $b(ac - bd) = 0$ and $a(bc + ad) = 0$, so
subtraction yields

$(a^2 + b^2)d = 0$. (##)

Now, $a + bi \ne 0$, by hypothesis; so not both a and b are 0. Therefore, by
well-known properties of real numbers, it follows that $a^2 + b^2 \ne 0$. Since R
is an integral domain, (#) and (##) imply that c = d = 0. Thus, C is an
integral domain.
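
The identities used in this argument can be tested numerically. In the Python sketch below (an added illustration with hypothetical function names), a complex number a + bi is represented by the pair (a, b) and multiplied by the rule stated above; restricting a and b to a small range of integers is an assumption made only to keep the enumeration finite and exact.

```python
from itertools import product


def c_mul(z, w):
    """(a + bi)(c + di) = (ac - bd) + (bc + ad)i, with numbers stored as pairs."""
    a, b = z
    c, d = w
    return (a * c - b * d, b * c + a * d)


def pairs(lo, hi):
    """All pairs (a, b) with lo <= a, b <= hi."""
    return list(product(range(lo, hi + 1), repeat=2))


def associative_on(sample):
    """Check (zw)u = z(wu) for all z, w, u in the sample."""
    return all(
        c_mul(c_mul(z, w), u) == c_mul(z, c_mul(w, u))
        for z, w, u in product(sample, repeat=3)
    )


def no_zero_divisors_on(sample):
    """Check that zw = 0 only when z = 0 or w = 0, for z, w in the sample."""
    zero = (0, 0)
    return all(
        c_mul(z, w) != zero
        for z, w in product(sample, repeat=2)
        if z != zero and w != zero
    )


if __name__ == "__main__":
    sample = pairs(-2, 2)
    print(associative_on(sample), no_zero_divisors_on(sample))
```

A finite check of this kind proves nothing about all of C, of course; the proof above does that using only properties of R.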

An alternative proof that C has no zero-divisors goes as follows. From (*)
we have


$(a - bi)((a + bi)(c + di)) = (a - bi)0 = 0$

and hence


$(a^2 + b^2)(c + di) = 0$.

Since $a^2 + b^2 \ne 0$, there is an element $1/(a^2 + b^2)$ (by which is meant an
inverse of $a^2 + b^2$ for multiplication) in R. Of course, $R \subset C$, so, in particular,
$1/(a^2 + b^2) \in C$. Consequently, multiplication by $1/(a^2 + b^2)$ gives c + di = 0.
The key to this version of the proof is some insight as to what goes on in C.


2-2-2. Example. Consider the set of all even integers. It may be denoted
by


$2Z = \{2n \mid n \in Z\}$.

With respect to the usual operations of addition and multiplication of integers,
2Z is clearly a commutative ring with no zero-divisors. Note that in verifying
the required axioms, several of them (for example, the two associative laws,
the two commutative laws, and the distributive laws) are automatic, because
they hold for any integers and hence surely for even integers. However, 2Z
does not have an identity for multiplication, so according to the definition
2-1-18, 2Z is not an integral domain.

In similar fashion, for any integer m > 1, consider the set of all integers
which are divisible by m. This set is denoted by

$mZ = \{mn \mid n \in Z\}$,

and it consists of all multiples of m in Z. Again, with respect to the usual
operations with integers, mZ is a commutative ring with no unity element and
with no zero-divisors.

2-2-3. Example. Suppose we have a set consisting of two elements,
R = {0, 1}. Let us define addition and multiplication in R by the rules

0 + 1 = 1 + 0 = 1, 0 + 0 = 1 + 1 = 0,


0 · 1 = 1 · 0 = 0 · 0 = 0, 1 · 1 = 1.

It is easy to verify, by explicit enumeration of the various cases for each
axiom, that R is an integral domain. This is not surprising because R is
“really” $Z_2$ which, according to 2-1-19, is an integral domain.

In somewhat analogous fashion, let R be the three-element set
R = {a, b, c} and define addition and multiplication in R according to the
following tables.


    addition in R          multiplication in R

    + | a b c              · | a b c
    --+------              --+------
    a | c a b              a | c b a
    b | a b c              b | b b b
    c | b c a              c | a b c


It is tedious to verify that {R, +, ·} is a ring; for example, each associative
law has 27 different cases, each of which must be checked. Furthermore,
multiplication is commutative, c is an identity for multiplication, and b is the
zero element. In fact, R is an integral domain.

The incisive way to look at this example is to set up the correspondence

$a \leftrightarrow 2$, $b \leftrightarrow 0$, $c \leftrightarrow 1$.

Then R becomes the set {0, 1, 2}, and the operations as given by the tables
become those of $Z_3$. In other words, this example is simply a disguised form
of the integral domain $Z_3$.
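
The claim that the tables are those of $Z_3$ in disguise can be checked entry by entry. The Python sketch below is an added illustration; the dictionaries `ADD` and `MUL` transcribe the two tables, `PHI` is the correspondence a ↔ 2, b ↔ 0, c ↔ 1, and all of these names are hypothetical.

```python
# Row x, column y of each table gives x + y (respectively x . y) in R.
ADD = {
    "a": {"a": "c", "b": "a", "c": "b"},
    "b": {"a": "a", "b": "b", "c": "c"},
    "c": {"a": "b", "b": "c", "c": "a"},
}
MUL = {
    "a": {"a": "c", "b": "b", "c": "a"},
    "b": {"a": "b", "b": "b", "c": "b"},
    "c": {"a": "a", "b": "b", "c": "c"},
}
PHI = {"a": 2, "b": 0, "c": 1}  # the correspondence with Z_3


def matches_z3():
    """True when PHI turns both tables into addition and multiplication mod 3."""
    return all(
        PHI[ADD[x][y]] == (PHI[x] + PHI[y]) % 3
        and PHI[MUL[x][y]] == (PHI[x] * PHI[y]) % 3
        for x in "abc"
        for y in "abc"
    )


if __name__ == "__main__":
    print(matches_z3())
```

Because the correspondence is one-to-one and respects both operations, every ring axiom verified for $Z_3$ carries over to R, which replaces the 27-case checks mentioned above.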
2-2-4. Example. Consider the complex number $i = \sqrt{-1}$, and the set of
all complex numbers a + bi for which both a and b are integers. This subset
of C may be written as

$\{a + bi \mid a, b \in Z\}$.

Under the standard operations of addition and multiplication of complex
numbers, it is straightforward to verify that we have an integral domain.
It is known as the domain of Gaussian integers. We shall denote it by $Z[i]$,
or $Z[\sqrt{-1}]$, or $Z + Z\sqrt{-1}$.

In the same way, the reader may check that

$Q[i] = \{a + bi \mid a, b \in Q\}$,

which we also denote by $Q[\sqrt{-1}]$ or $Q + Q\sqrt{-1}$, is an integral domain. Of
course,

$Z[i] \subset Q[i] \subset C$.

2-2-5. Example. Consider the real number $\sqrt{2}$ and the subset
$Z[\sqrt{2}] = Z + Z\sqrt{2}$ of R defined by

$Z[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in Z\}$.

The usual way to add and multiply these elements is

$(a + b\sqrt{2}) + (c + d\sqrt{2}) = (a + c) + (b + d)\sqrt{2}$,

$(a + b\sqrt{2})(c + d\sqrt{2}) = (ac + 2bd) + (bc + ad)\sqrt{2}$,

and, of course

$a + b\sqrt{2} = c + d\sqrt{2} \Leftrightarrow a = c$ and $b = d$.

One checks easily that with these standard operations $Z[\sqrt{2}]$ is an integral
domain.

In the same way, we see that

$Q[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in Q\}$

(which contains $Z[\sqrt{2}]$ and is contained in R) is an integral domain.
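
The rules for $Z[\sqrt{2}]$ translate directly into exact integer arithmetic on pairs. In the Python sketch below (an added illustration; the names `add`, `mul`, and `value` are hypothetical), the element $a + b\sqrt{2}$ is stored as the pair (a, b), and the floating-point `value` function is used only to compare against ordinary multiplication of real numbers.

```python
import math


def add(x, y):
    """(a + b*sqrt2) + (c + d*sqrt2) = (a + c) + (b + d)*sqrt2."""
    return (x[0] + y[0], x[1] + y[1])


def mul(x, y):
    """(a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (bc + ad)*sqrt2."""
    a, b = x
    c, d = y
    return (a * c + 2 * b * d, b * c + a * d)


def value(x):
    """Approximate real value of a + b*sqrt2 (floating point, for comparison only)."""
    return x[0] + x[1] * math.sqrt(2)


if __name__ == "__main__":
    x, y = (1, 1), (1, -1)          # 1 + sqrt2 and 1 - sqrt2
    print(add(x, y), mul(x, y))     # exact results as integer pairs
    print(math.isclose(value(mul(x, y)), value(x) * value(y)))
```

Working with integer pairs keeps every computation exact, which is one practical reason to think of $Z[\sqrt{2}]$ as pairs of integers with a special multiplication.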

We did not carry out the verification of the ring axioms for the preceding
examples in detail, nor did we even indicate how to show that $Z[i]$ and $Z[\sqrt{2}]$
have no zero-divisors. In part, the intention was to induce the reader to
supply the details himself. However, as we shall see momentarily, most of the
mechanical work involved in verifying the axioms was really superfluous.
2-2-6. Definition. Suppose {R, +, ·} is a ring and let S be a nonempty
subset of R. We say that S is a subring of R if, with respect to the operations
+ and · given in R, S becomes a ring in its own right; in other words, S is a
subring of R when {S, +, ·} is a ring.

To illustrate, R itself is always a subring of R, and so is (0), the subset
consisting of the zero element alone. Furthermore, let us consider the inclusions
$Z[i] \subset Q[i] \subset C$, $2Z \subset Z \subset Q \subset R \subset C$, $Z \subset Z[\sqrt{2}] \subset Q[\sqrt{2}] \subset R$. In virtue
of the earlier examples, each term or symbol which appears is a ring. Because
in each and every case we use the “standard” addition and multiplication, it
follows that each of the rings listed is a subring of all the listed rings which
contain it.

Consider further the ring Z, and let S be the subset of all odd integers,
$S = \{2n + 1 \mid n \in Z\}$.

Then S is not a subring of Z; in fact, S is not closed under addition, nor does
it have a zero element.

Finally, suppose S is the set of all nonnegative real numbers
$S = \{a \mid a \in R,\ a \ge 0\}$.

Is S a subring of R? One checks easily that S satisfies all the axioms for a ring
except the existence of inverses for addition, so S is not a subring of R.

To decide, in general, if a subset S in a ring R is a subring it is not necessary
to verify all the ring axioms in S. Our next result provides criteria for deciding
whether or not S is a subring.


2-2-7. Proposition. Suppose S is a nonempty subset of the ring {R, +, ·};
then the following are equivalent:

(i) S is a subring of R,

(ii) $a, b \in S \Rightarrow a + b \in S,\ -a \in S,\ ab \in S$,

(iii) $a, b \in S \Rightarrow a - b \in S,\ ab \in S$.

Proof: According to condition (ii), in order to show that S is a subring
it suffices to verify that S is closed under the operations, in R, of addition,
taking of additive inverse, and multiplication. According to condition (iii), it
suffices to verify that S is closed under subtraction and multiplication in R.
Now for the proof.

(i) $\Rightarrow$ (ii). By hypothesis, S is a ring when addition and multiplication of
elements are carried over from R. If $a, b \in S$, then a + b and ab (the sum and
product of elements of R) are in S, because S is a ring. It remains to show that
$-a$, which is the additive inverse of a in R, belongs to S.

The difficulty here is this. The zero element of R is $0 \in R$. Since S is a ring,
it has a zero element, but we do not know that it is the element 0. Thus let
$0' \in S$ denote the zero element of S, and in similar fashion let $a' \in S$ denote the
additive inverse of a in S. We must show that $-a = a' \in S$.

For $a \in S$ we have


$a + 0 = a = a + 0'$


so, by cancellation in R, 0 = 0'. In other words, the zero element of R belongs
to S, and R and S have the same zero element. Consequently,

$a + a' = 0' = 0 = a + (-a)$

and we conclude (again, by cancellation in R) that $a' = -a$, thus completing
this part of the proof.

(ii) $\Rightarrow$ (iii). Suppose we are given $a, b \in S$. Because (ii) holds, we have
$ab \in S$ and also $-b \in S$. In order for (iii) to hold, it remains to show that
$a - b \in S$. But now, $a, b \in S$ implies $a, -b \in S$, and then by (ii) their sum
$a + (-b)$ is in S; that is, $a - b \in S$, and this part of the proof is complete.

(iii) $\Rightarrow$ (i). We must verify all the ring axioms for S, under the assumption
that (iii) holds. The associative laws for addition and multiplication hold for
elements of S because they already hold for arbitrary elements of R. Similarly,
the commutative law for addition and the distributive laws hold for elements
of S. In addition, according to (iii), S is closed under multiplication. It
remains to show: A1, S is closed under addition; A3, S has a zero element;
A4, every element of S has an additive inverse in S. To do all this, consider
any $a, b \in S$. Applying (iii) for the special case b = a we have $0 = a - a = a - b \in S$,
so A3 holds. Now, applying (iii) for the elements $0, a \in S$ gives
$-a = 0 - a \in S$, so A4 holds. Finally, returning to arbitrary $a, b \in S$ we
have $-b \in S$ and by (iii), $a + b = a - (-b) \in S$, so A1 holds. This completes
the entire proof. ∎
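
Criterion (iii) is simple enough to apply by machine. The Python sketch below is an added illustration with hypothetical function names; it uses the criterion to find every subring of $Z_{12}$ by testing each nonempty subset for closure under subtraction and multiplication, with classes represented by 0, 1, ..., 11.

```python
def is_subring_mod(subset, m):
    """Criterion (iii) of 2-2-7 in Z_m: nonempty, closed under a - b and ab."""
    s = set(subset)
    return bool(s) and all(
        (a - b) % m in s and (a * b) % m in s for a in s for b in s
    )


def subrings_mod(m):
    """All subrings of Z_m, found by testing every nonempty subset."""
    found = []
    for mask in range(1, 1 << m):
        s = {a for a in range(m) if (mask >> a) & 1}
        if is_subring_mod(s, m):
            found.append(sorted(s))
    return found


if __name__ == "__main__":
    for s in subrings_mod(12):
        print(s)
```

Each subring found consists of the multiples of a divisor of 12, and the set of odd classes is rejected because it is not closed under subtraction, in line with the odd-integer example above.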

2-2-8. Remark. The utility of the preceding result may be illustrated
by using it to show that $\mathbf{Z}[\sqrt{2}]$ is a ring. Taking it for granted that $\mathbf{R}$ is a
ring with respect to the usual operations, we note first that
$\mathbf{Z}[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in \mathbf{Z}\}$ is a nonempty subset of $\mathbf{R}$. Moreover, the operations
$+$ and $\cdot$ defined for $\mathbf{Z}[\sqrt{2}]$ (see 2-2-5) are precisely those for real numbers.
Now, for elements $\alpha = a + b\sqrt{2}$ and $\beta = c + d\sqrt{2}$ of $\mathbf{Z}[\sqrt{2}]$ it is immediate
that

\[ \alpha - \beta = (a - c) + (b - d)\sqrt{2} \in \mathbf{Z}[\sqrt{2}] \quad \text{and} \quad \alpha\beta = (ac + 2bd) + (ad + bc)\sqrt{2} \in \mathbf{Z}[\sqrt{2}], \]

so, according to condition (iii) of 2-2-7, $\mathbf{Z}[\sqrt{2}]$ is a subring of $\mathbf{R}$, and in
particular $\mathbf{Z}[\sqrt{2}]$ is a ring in its own right.

Furthermore, $1 = 1 + 0\sqrt{2}$ is clearly an identity for multiplication, and
multiplication in $\mathbf{Z}[\sqrt{2}]$ is commutative because the commutative law for
multiplication holds in $\mathbf{R}$. Thus, $\mathbf{Z}[\sqrt{2}]$ is a commutative ring with unity;
but how do we know that it is an integral domain? To settle this question it is
not necessary to imitate the procedure used in 2-2-1 to verify that $\mathbf{C}$ has no
zero-divisors. Instead, we note that because the product of two nonzero real
numbers cannot be zero, the same property holds for elements of the subset
$\mathbf{Z}[\sqrt{2}]$ of $\mathbf{R}$. Consequently, $\mathbf{Z}[\sqrt{2}]$ has no zero-divisors and it is an integral
domain.

In general, this argument shows that any subring of an integral domain has
no zero-divisors. However, the subring need not be an integral domain because
(just as for $2\mathbf{Z} \subset \mathbf{Z}$) the subring may not have an identity for
multiplication, and according to the definition of integral domain, 2-1-18, it must have
a unity element.

2-2-9. Examples. Consider $\mathbf{Z} \times \mathbf{Z}$, the set of all ordered pairs of integers;
in other words,

\[ \mathbf{Z} \times \mathbf{Z} = \{(a, b) \mid a, b \in \mathbf{Z}\}. \]

This is a special case of a general notation according to which for any sets
$X$ and $Y$ we write

\[ X \times Y = \{(x, y) \mid x \in X,\ y \in Y\}. \]

There are many ways to define addition and multiplication in $\mathbf{Z} \times \mathbf{Z}$. For
example, let us put

\[
\begin{aligned}
(a, b) + (c, d) &= (a + c,\ b + d), \\
(a, b) \cdot (c, d) &= (ac - bd,\ bc + ad).
\end{aligned}
\tag{$*$}
\]

The burden of performing the arithmetic work required for verifying that
$\mathbf{Z} \times \mathbf{Z}$ is a ring with respect to these operations is left to the reader. For
example, the associative law for multiplication says that for any $\alpha = (a, b)$,
$\beta = (c, d)$, $\gamma = (e, f)$ in $\mathbf{Z} \times \mathbf{Z}$ we have

\[ (\alpha\beta)\gamma = \alpha(\beta\gamma) \]

and this relation holds because both sides turn out to equal

\[ (ace - bde - bcf - adf,\ bce + ade + acf - bdf). \]

Even more, the reader may wish to verify that this is an integral domain.

In the same way, there is no difficulty in checking that $\mathbf{Z} \times \mathbf{Z}$ is a ring
with respect to the operations

\[
\begin{aligned}
(a, b) + (c, d) &= (a + c,\ b + d), \\
(a, b) \cdot (c, d) &= (ac + 2bd,\ bc + ad).
\end{aligned}
\tag{$**$}
\]

For example, one of the distributive laws says that if $\alpha = (a, b)$, $\beta = (c, d)$,
$\gamma = (e, f)$ in $\mathbf{Z} \times \mathbf{Z}$ we have

\[ \alpha(\beta + \gamma) = \alpha\beta + \alpha\gamma \]

and this relation holds because both sides equal

\[ (ac + ae + 2bd + 2bf,\ bc + be + ad + af). \]

Here, as in the preceding case, $(0, 0)$ is the zero element and the additive
inverse of $(a, b)$ is $(-a, -b)$. Again, the reader may try to verify that $\mathbf{Z} \times \mathbf{Z}$
with the operations (**) is an integral domain. Clearly, $(1, 0)$ is an identity for
multiplication and multiplication is commutative. What about the property of
“no zero-divisors”? For this, given $(a, b) \neq (0, 0) = 0$ (meaning that not both
$a$ and $b$ equal 0), it must be shown that if

\[ (a, b) \cdot (c, d) = (0, 0), \]

then $(c, d) = (0, 0)$. The reader is challenged to do this.

Still another way to make $\mathbf{Z} \times \mathbf{Z}$ into a ring is to define

\[
\begin{aligned}
(a, b) + (c, d) &= (a + c,\ b + d), \\
(a, b) \cdot (c, d) &= (ac - bd,\ ad + bc - bd).
\end{aligned}
\tag{$***$}
\]

In this case, verification of the ring axioms is still straightforward, but it is
somewhat more difficult to show that $\mathbf{Z} \times \mathbf{Z}$ is an integral domain.
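
The three multiplication rules (*), (**), and (***) can be spot-checked by machine before any proof is attempted. The following is a minimal sketch; the function names and the search bound are assumptions made for illustration, and a finite search of this kind is evidence only, not a proof that the axioms hold for all integers.

```python
from itertools import product

def add(p, q):
    """Componentwise addition, shared by all three rules."""
    return (p[0] + q[0], p[1] + q[1])

def mul_star(p, q):
    """Rule (*): (a, b)(c, d) = (ac - bd, bc + ad)."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, b * c + a * d)

def mul_2star(p, q):
    """Rule (**): (a, b)(c, d) = (ac + 2bd, bc + ad)."""
    (a, b), (c, d) = p, q
    return (a * c + 2 * b * d, b * c + a * d)

def mul_3star(p, q):
    """Rule (***): (a, b)(c, d) = (ac - bd, ad + bc - bd)."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c - b * d)

def spot_check(mul, bound=2):
    """Test commutativity, associativity, one distributive law, and the
    absence of zero-divisors on all pairs with entries in [-bound, bound]."""
    pairs = list(product(range(-bound, bound + 1), repeat=2))
    zero = (0, 0)
    for p, q in product(pairs, repeat=2):
        if mul(p, q) != mul(q, p):
            return False
        if p != zero and q != zero and mul(p, q) == zero:
            return False
        for r in pairs:
            if mul(mul(p, q), r) != mul(p, mul(q, r)):
                return False
            if mul(p, add(q, r)) != add(mul(p, q), mul(p, r)):
                return False
    return True

results = {
    "*": spot_check(mul_star),
    "**": spot_check(mul_2star),
    "***": spot_check(mul_3star),
}
print(results)
```

Squaring the pair $(0, 1)$ under each rule is also instructive: the results are $(-1, 0)$, $(2, 0)$, and $(-1, -1)$ respectively, which hints at what the pair $(0, 1)$ stands for in each case.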


It may well be asked what these examples, in which $\mathbf{Z} \times \mathbf{Z}$ is made into a
ring, are about. Where do they come from? The explanation is surprisingly
simple. For example, consider the integral domain $\mathbf{Z}[i]$. If we choose to
associate with an element $a + bi$ the ordered pair $(a, b)$, this determines a 1-1
correspondence

\[ a + bi \leftrightarrow (a, b) \]

between the sets $\mathbf{Z}[i]$ and $\mathbf{Z} \times \mathbf{Z}$. Upon rewriting the rules for addition and
multiplication in $\mathbf{Z}[i]$ in terms of this new ordered-pair notation, we observe
that they turn out to be precisely the rules given by (*). Consequently,
$\mathbf{Z} \times \mathbf{Z}$, with operations defined by (*), is just $\mathbf{Z}[i]$ in disguise, so it is an
integral domain, and all the work the reader may have done in showing that
$\mathbf{Z} \times \mathbf{Z}$ with operations (*) is an integral domain was superfluous!

Similarly, $\mathbf{Z} \times \mathbf{Z}$ with operations given by (**) is simply another version
of $\mathbf{Z}[\sqrt{2}]$ (that is, with $(a, b)$ corresponding to $a + b\sqrt{2}$), so it is an integral
domain.

The story underlying the situation (***) is a bit more complicated. Consider
the complex number

\[ \omega = \frac{-1 + \sqrt{-3}}{2}. \]

Then

\[ \omega^2 = \frac{-1 - \sqrt{-3}}{2} \quad \text{and} \quad \omega^3 = 1. \]

Thus $\omega$ is a cube root of unity (the other cube roots of unity being $\omega^2$ and 1),
and because $0 = \omega^3 - 1 = (\omega - 1)(\omega^2 + \omega + 1)$ we know that

\[ \omega^2 + \omega + 1 = 0 \]

(a fact which may also be verified by a direct computation). Let us write

\[ \mathbf{Z}[\omega] = \mathbf{Z} + \mathbf{Z}\omega = \{a + b\omega \mid a, b \in \mathbf{Z}\}. \]

This is surely a nonempty subset of the integral domain $\mathbf{C}$, and we already
know how to add and multiply elements of $\mathbf{Z}[\omega]$; namely,

\[ (a + b\omega) + (c + d\omega) = (a + c) + (b + d)\omega \]

and using the fact that $\omega^2 = -1 - \omega$,

\[ (a + b\omega)(c + d\omega) = (ac - bd) + (ad + bc - bd)\omega. \]

Thus, $\mathbf{Z}[\omega]$ is closed under the addition and multiplication from $\mathbf{C}$. In addition,
$-(a + b\omega) = (-a) + (-b)\omega$ is an element of $\mathbf{Z}[\omega]$, so it follows from 2-2-7
that $\mathbf{Z}[\omega]$ is a subring of $\mathbf{C}$. Of course, $1 = 1 + 0\omega$ is an identity for
multiplication, and because $\mathbf{Z}[\omega]$ is contained in $\mathbf{C}$, multiplication is commutative and
there are no zero-divisors. Therefore, $\mathbf{Z}[\omega]$ is an integral domain. Now, instead
of $a + b\omega$ let us write $(a, b)$, thereby setting up a 1-1 correspondence between
$\mathbf{Z}[\omega]$ and $\mathbf{Z} \times \mathbf{Z}$. When addition and multiplication in $\mathbf{Z}[\omega]$ are rewritten in
the notation of $\mathbf{Z} \times \mathbf{Z}$, the result is precisely the rules (***). This proves that
(and indicates why) $\mathbf{Z} \times \mathbf{Z}$ is an integral domain in the situation (***)!
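
The two computations on which this paragraph rests can be written out in full. The derivation below uses only the definition of $\omega$ and the relation $\omega^2 = -1 - \omega$ stated above; nothing else is assumed.

```latex
\begin{align*}
\omega^2 &= \frac{(-1 + \sqrt{-3})^2}{4}
          = \frac{1 - 2\sqrt{-3} - 3}{4}
          = \frac{-1 - \sqrt{-3}}{2}, \\
\omega^3 &= \omega \cdot \omega^2
          = \frac{(-1 + \sqrt{-3})(-1 - \sqrt{-3})}{4}
          = \frac{1 - (-3)}{4} = 1, \\
(a + b\omega)(c + d\omega)
         &= ac + (ad + bc)\omega + bd\,\omega^2 \\
         &= ac + (ad + bc)\omega + bd(-1 - \omega) \\
         &= (ac - bd) + (ad + bc - bd)\omega .
\end{align*}
```

Reading off the coefficients of $1$ and $\omega$ in the last line gives exactly the pair $(ac - bd,\ ad + bc - bd)$ of rule (***).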


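2-2-10. Example. Let $\mathcal{M}(\mathbf{Z}, 2)$ denote the set of all $2 \times 2$ matrices with
entries from $\mathbf{Z}$. Of course, by an element $A \in \mathcal{M}(\mathbf{Z}, 2)$ we mean an array

\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \qquad a_{11}, a_{12}, a_{21}, a_{22} \in \mathbf{Z}, \]

with two rows and two columns. The notation is arranged so that $a_{ij}$ is the
entry in the $i$th row and the $j$th column of $A$ (of course, here the only choices
for $i$ and $j$ are 1 and 2). If $B$ is also an element of $\mathcal{M}(\mathbf{Z}, 2)$, then we write

\[ B = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \]

and define the sum $A + B$ and product $AB$ as

\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
+ \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
= \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix},
\]

\[
\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\cdot \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
= \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}.
\]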


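Thus, addition involves simply adding corresponding entries; for example,

\[
\begin{pmatrix} 2 & 3 \\ 5 & 7 \end{pmatrix}
+ \begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & 4 \\ 4 & 8 \end{pmatrix}.
\]

Multiplication is more complicated; the scheme for carrying it out is often
called row-by-column multiplication. To obtain the $i, j$ entry of $AB$ (that is,
the element in the $i$th row and $j$th column of the product matrix $AB$) one
takes the $i$th row of $A$ and the $j$th column of $B$, multiplies term by term and
takes the sum of these products. An example may further clarify the meaning
of these words.

\[
\begin{pmatrix} 2 & 3 \\ 5 & 7 \end{pmatrix}
\begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix}
= \begin{pmatrix} -2 - 3 & 2 + 3 \\ -5 - 7 & 5 + 7 \end{pmatrix}
= \begin{pmatrix} -5 & 5 \\ -12 & 12 \end{pmatrix}
\]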

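It is understood, of course, that two matrices are equal when they are
identical term-by-term; in other words,

\[ A = B \iff a_{11} = b_{11},\ a_{12} = b_{12},\ a_{21} = b_{21},\ a_{22} = b_{22}. \]

It is clear from the definitions that $\mathcal{M}(\mathbf{Z}, 2)$ is closed under both addition
and multiplication. The associative and commutative laws for addition hold
in $\mathcal{M}(\mathbf{Z}, 2)$ because they hold in $\mathbf{Z}$. The matrix

\[ 0 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]

is clearly a zero element, since $A + 0 = A$ for all $A \in \mathcal{M}(\mathbf{Z}, 2)$. The matrix
$A$ has an additive inverse, namely

\[ -A = \begin{pmatrix} -a_{11} & -a_{12} \\ -a_{21} & -a_{22} \end{pmatrix}, \]

for surely $A + (-A) = 0$.

The associative law for multiplication says

\[ (AB)C = A(BC) \]

for all $A, B, C \in \mathcal{M}(\mathbf{Z}, 2)$. It is not obvious at all, and the only way to verify
it is by brute force. Let us carry out the details.

\[
\begin{aligned}
(AB)C &= \left[ \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \right]
\begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \\
&= \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}
\begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \\
&= \begin{pmatrix}
(a_{11}b_{11} + a_{12}b_{21})c_{11} + (a_{11}b_{12} + a_{12}b_{22})c_{21} & (a_{11}b_{11} + a_{12}b_{21})c_{12} + (a_{11}b_{12} + a_{12}b_{22})c_{22} \\
(a_{21}b_{11} + a_{22}b_{21})c_{11} + (a_{21}b_{12} + a_{22}b_{22})c_{21} & (a_{21}b_{11} + a_{22}b_{21})c_{12} + (a_{21}b_{12} + a_{22}b_{22})c_{22}
\end{pmatrix}
\end{aligned}
\]

while

\[
\begin{aligned}
A(BC) &= \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\left[ \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix}
\begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix} \right] \\
&= \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
\begin{pmatrix} b_{11}c_{11} + b_{12}c_{21} & b_{11}c_{12} + b_{12}c_{22} \\ b_{21}c_{11} + b_{22}c_{21} & b_{21}c_{12} + b_{22}c_{22} \end{pmatrix} \\
&= \begin{pmatrix}
a_{11}(b_{11}c_{11} + b_{12}c_{21}) + a_{12}(b_{21}c_{11} + b_{22}c_{21}) & a_{11}(b_{11}c_{12} + b_{12}c_{22}) + a_{12}(b_{21}c_{12} + b_{22}c_{22}) \\
a_{21}(b_{11}c_{11} + b_{12}c_{21}) + a_{22}(b_{21}c_{11} + b_{22}c_{21}) & a_{21}(b_{11}c_{12} + b_{12}c_{22}) + a_{22}(b_{21}c_{12} + b_{22}c_{22})
\end{pmatrix}
\end{aligned}
\]

and then one checks easily that these two end-result matrices are equal.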

The distributive laws, both of which must be verified, say that

\[ A(B + C) = AB + AC \quad \text{and} \quad (B + C)A = BA + CA \]

for all $A, B, C \in \mathcal{M}(\mathbf{Z}, 2)$. The details may safely be left to the reader.
Furthermore, if we write

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]

then a straightforward computation gives $IA = AI = A$ for all $A \in \mathcal{M}(\mathbf{Z}, 2)$,
so $I$ is an identity for multiplication.

From all this we conclude that $\mathcal{M}(\mathbf{Z}, 2)$ is a ring with unity. Is this ring
commutative? Roughly speaking, if the reader chooses any two elements
$A, B \in \mathcal{M}(\mathbf{Z}, 2)$ at random, he will find that $AB \neq BA$. For example, taking

Mio)• -n 

we have 


AB = 0, BA = B. 
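
Row-by-column multiplication is easy to mechanize, which makes it painless to experiment with non-commuting pairs. The sketch below is illustrative only: the helper name `mat_mul` and the particular matrices are assumptions, chosen as one pair that produces the outcome $AB = 0$, $BA = B$; they are not claimed to be the matrices printed in the text.

```python
def mat_mul(A, B):
    """Row-by-column product of two 2x2 matrices given as nested tuples:
    entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

ZERO = ((0, 0), (0, 0))

# One assumed pair with AB = 0 and BA = B.
A = ((1, 0),
     (0, 0))
B = ((0, 0),
     (1, 0))

AB = mat_mul(A, B)
BA = mat_mul(B, A)
print(AB == ZERO, BA == B, AB != BA)
```

With this pair the product of two nonzero matrices is zero in one order and nonzero in the other, so commutativity fails and zero-divisors appear at the same time.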


Besides showing that multiplication in $\mathcal{M}(\mathbf{Z}, 2)$ does not satisfy the
commutative law, this example touches on other questions, such as zero-divisors
or cancellation, but we shall not pursue such matters here.

The ring $\mathcal{M}(\mathbf{Z}, 2)$ has many subrings. Consider, for example, the nonempty
subset

\[ S = \{A \in \mathcal{M}(\mathbf{Z}, 2) \mid a_{12} = a_{21} = a_{22} = 0\} \]

of $\mathcal{M}(\mathbf{Z}, 2)$. In other words, $S$ consists of all elements of form

\[ A = \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix}, \qquad a \in \mathbf{Z}. \]

If we also take an element

\[ B = \begin{pmatrix} b & 0 \\ 0 & 0 \end{pmatrix}, \qquad b \in \mathbf{Z}, \]

from $S$, then clearly

\[ A - B = \begin{pmatrix} a - b & 0 \\ 0 & 0 \end{pmatrix}, \qquad AB = \begin{pmatrix} ab & 0 \\ 0 & 0 \end{pmatrix}. \]

Thus, $A - B$ and $AB$ have the “proper form” and are therefore elements of $S$.
This proves that $S$ is a subring. In fact, $S$ behaves very much like the ring $\mathbf{Z}$.
Consider next the set $T$ of all elements of $\mathcal{M}(\mathbf{Z}, 2)$ which are of form

\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix}. \]

Thus, 

Ui) 

is an element of T, and 

G ") 

is not. Taking the difference of two elements of $T$, we have

\[
\begin{pmatrix} a & -b \\ b & a \end{pmatrix} - \begin{pmatrix} c & -d \\ d & c \end{pmatrix}
= \begin{pmatrix} a - c & -(b - d) \\ b - d & a - c \end{pmatrix},
\]

so the result is an element of $T$ (because it is of the proper form), and $T$ is,
therefore, closed under subtraction. As for the product of two elements of $T$,
we have

\[
\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c & -d \\ d & c \end{pmatrix}
= \begin{pmatrix} ac - bd & -ad - bc \\ bc + ad & -bd + ac \end{pmatrix}.
\]

The resulting matrix is an element of $T$, so $T$ is closed under multiplication.
It follows that $T$ is a subring of $\mathcal{M}(\mathbf{Z}, 2)$. Even more, the reader can easily
check that $T$ is a commutative ring with unity.

Now, $\mathcal{M}(\mathbf{Z}, 2)$ can be generalized in several ways. One way is to take the
entries of our $2 \times 2$ matrices from some arbitrary ring $R$, instead of from $\mathbf{Z}$.
This set may be denoted by $\mathcal{M}(R, 2)$. It is a ring; the verification is exactly as
for $\mathcal{M}(\mathbf{Z}, 2)$.

In addition, we may consider $\mathcal{M}(R, n)$, by which is meant the set of all
$n \times n$ matrices with entries from $R$, for any $n$ greater than or equal to 2. Thus,
an element of $\mathcal{M}(R, n)$ is of form

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1j} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{ij} & \cdots & a_{in} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nj} & \cdots & a_{nn}
\end{pmatrix}.
\]

Upon defining addition of $n \times n$ matrices to be componentwise and
multiplication to be “row-by-column” it can be verified (a lot of bookkeeping is
involved in keeping the subscripts straight, especially in the associative law for
multiplication) that $\mathcal{M}(R, n)$ is a ring.

2-2-11. Example. Suppose a ring $R$ is given and consider the product set
$R \times R = \{(a, b) \mid a, b \in R\}$. If we define addition and multiplication
“componentwise” in $R \times R$, namely

\[
\begin{aligned}
(a, b) + (c, d) &= (a + c,\ b + d), \\
(a, b) \cdot (c, d) &= (ac,\ bd),
\end{aligned}
\]

then it is trivial to check that $R \times R$ becomes a ring. Everything goes smoothly
precisely because $R$ is a ring. We denote this new ring by $R^{(2)}$ or $R \oplus R$, and
refer to it as the direct sum (or as the direct product) of $R$ with itself. The zero
element of $R \oplus R$ is obviously $(0, 0)$ and the additive inverse of $(a, b)$ is
$(-a, -b)$.

Does $R \oplus R$ have an identity for multiplication? This depends on $R$. If $R$
has an identity $e$, then obviously $(e, e)$ is an identity for multiplication in
$R \oplus R$. On the other hand, if $R$ has no identity, then neither does $R \oplus R$.
Turning to the commutative law for multiplication, it is immediate that $R \oplus R$
is a commutative ring if and only if $R$ is a commutative ring. Note, however,
that under no circumstances can $R \oplus R$ be an integral domain; in fact, if
$a \neq 0$ is an element of $R$, then $(a, 0)$ is a zero-divisor in $R \oplus R$, since
$(a, 0)(0, a) = (0, a)(a, 0) = (0, 0)$.
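
The componentwise operations are short enough to try out directly. The sketch below takes $R = \mathbf{Z}$ (Python integers) purely as an assumed sample ring; the names `ds_add` and `ds_mul` are hypothetical helpers, and the same two lines work for tuples of any length.

```python
def ds_add(p, q):
    """Componentwise sum of two tuples of the same length."""
    return tuple(x + y for x, y in zip(p, q))

def ds_mul(p, q):
    """Componentwise product of two tuples of the same length."""
    return tuple(x * y for x, y in zip(p, q))

identity = (1, 1)            # (e, e) with e = 1 in Z
a = 5                        # any nonzero element of the sample ring
left, right = (a, 0), (0, a)

print(ds_add((2, 3), (4, -1)))     # componentwise sum
print(ds_mul(identity, (3, -4)))   # (e, e) acts as an identity
print(ds_mul(left, right))         # two nonzero elements with product (0, 0)
```

Both `left` and `right` are nonzero, yet their product is the zero element, which is the zero-divisor phenomenon described above.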

Now, let us look at the set

\[ R' = \{(a, b) \in R \oplus R \mid b = 0\}. \]

In other words, $R'$ is the subset of $R \oplus R$ consisting of all elements of form
$(a, 0)$. Because $R'$ is obviously closed under subtraction and multiplication,
we see that $R'$ is a subring of $R \oplus R$. In a sense, $R'$ is just a copy of $R$; for the
second coordinate of elements of $R'$ is always 0, and one might choose to
ignore the second coordinate.

Similarly,

\[ R'' = \{(a, b) \in R \oplus R \mid a = 0\}, \]

the subset of $R \oplus R$ consisting of all elements whose first coordinate is 0, is a
subring of $R \oplus R$. It too is a “version” of $R$ under the correspondence
$(0, b) \leftrightarrow b$.

Still another subring of $R \oplus R$ which behaves like $R$ is the diagonal $\Delta$; by
this we have in mind

\[ \Delta = \{(a, b) \in R \oplus R \mid a = b\} = \{(a, a) \mid a \in R\}. \]

What about generalizing the direct sum $R \oplus R = R^{(2)}$? For any integer
$n \geq 2$, the set of all $n$-tuples of elements of $R$ becomes a ring, which we denote
by $R^{(n)}$, when the operations are defined componentwise. In other words,

\[ R^{(n)} = \{(a_1, a_2, \ldots, a_n) \mid a_i \in R,\ i = 1, \ldots, n\} \]

and the operations are

\[
\begin{aligned}
(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) &= (a_1 + b_1,\ a_2 + b_2,\ \ldots,\ a_n + b_n), \\
(a_1, a_2, \ldots, a_n) \cdot (b_1, b_2, \ldots, b_n) &= (a_1 b_1,\ a_2 b_2,\ \ldots,\ a_n b_n).
\end{aligned}
\]

Then $R^{(n)}$ is called the direct sum (or the direct product) of $n$ copies of $R$. One
often denotes it by $R \oplus \cdots \oplus R$, but then it is necessary to specify exactly
how many components (that is, copies of $R$) are involved.

Incidentally, in talking about $R^{(n)}$ the requirement that $n \geq 2$ is not
essential. One may surely speak also of $R^{(1)}$, the ring of 1-tuples, which is clearly $R$
itself.

One may go a step beyond $n$-tuples and make the set of all infinite
sequences of elements from $R$ into a ring by defining addition and
multiplication componentwise. This ring is denoted by

\[ R^{(\infty)} = \{(a_1, a_2, \ldots, a_n, \ldots) \mid a_i \in R,\ i = 1, 2, 3, \ldots\} \]

with operations

\[
\begin{aligned}
(a_1, a_2, \ldots, a_n, \ldots) + (b_1, b_2, \ldots, b_n, \ldots) &= (a_1 + b_1,\ a_2 + b_2,\ \ldots,\ a_n + b_n,\ \ldots), \\
(a_1, a_2, \ldots, a_n, \ldots) \cdot (b_1, b_2, \ldots, b_n, \ldots) &= (a_1 b_1,\ a_2 b_2,\ \ldots,\ a_n b_n,\ \ldots).
\end{aligned}
\]

Note that for any $n$, $R^{(n)}$ may be viewed as a subring of $R^{(\infty)}$. More precisely,
for fixed $n$, we consider the set

\[ \{(a_1, a_2, \ldots, a_n, a_{n+1}, \ldots) \in R^{(\infty)} \mid a_{n+1} = a_{n+2} = \cdots = 0\}; \]

in other words, we take those elements of $R^{(\infty)}$ for which all coordinates
after the $n$th are 0. This is clearly a subring of $R^{(\infty)}$, and surely it is a copy of
$R^{(n)}$.

There is really no need to restrict the discussion of direct sum to the case of
a single $R$. Thus, if we have $n$ rings $R_1, R_2, \ldots, R_n$, which may or may not
be distinct, then by the direct sum (or direct product) $R_1 \oplus R_2 \oplus \cdots \oplus R_n$ we
mean the ring whose underlying set is the product set

\[ R_1 \times R_2 \times \cdots \times R_n = \{(a_1, a_2, \ldots, a_n) \mid a_i \in R_i,\ i = 1, \ldots, n\} \]

(in other words, we take all $n$-tuples in which the $i$th coordinate comes from
$R_i$, for each $i = 1, \ldots, n$) and in which the operations are componentwise.
Of course, we also write, simply,

\[ R_1 \oplus \cdots \oplus R_n = \{(a_1, a_2, \ldots, a_n) \mid a_i \in R_i,\ i = 1, \ldots, n\}. \]

Naturally, the same thing can be done for an infinite number of rings
$R_i$, $i = 1, 2, 3, \ldots$.

2-2-12. Example. In the preceding example we have seen one method for
constructing new rings from old ones. Here we shall describe another
procedure for constructing new rings. The method, which is extremely important,
turns out to have certain connections with the method used in 2-2-11.

Suppose there are given an arbitrary ring $R = \{a, b, c, \ldots\}$ and an
arbitrary set $X = \{x, y, z, w, \ldots\}$. Consider the set of all mappings from $X$ into $R$.
We denote this set by $\operatorname{Map}(X, R)$. An element of $\operatorname{Map}(X, R)$ is a mapping or
function from $X$ into $R$ and is denoted by $f: X \to R$. Of course, a mapping
or function here is a “rule,” $f$ say, according to which there is assigned to
each $x \in X$ an element $f(x) \in R$; symbolically,

\[ f: x \to f(x), \qquad x \in X. \]

This “rule” $f$ may be given by some kind of a “formula” or it may simply
be a set of words that permit us to determine $f(x)$ when $x$ is given.

A quick illustration: Suppose the set $X$ consists of three elements

\[ X = \{x_1, x_2, x_3\} \]

and the ring $R$ is $\mathbf{Z}_2 = \{0, 1\}$. What is $\operatorname{Map}(X, \mathbf{Z}_2)$? First of all, an element
$f \in \operatorname{Map}(X, \mathbf{Z}_2)$ is determined by specifying elements $f(x_1)$, $f(x_2)$ and $f(x_3)$ of
$\mathbf{Z}_2$. Since each of $f(x_1)$, $f(x_2)$, $f(x_3)$ must be either 0 or 1, it follows that there
are eight mappings from $X$ into $\mathbf{Z}_2$. These may be labeled by $f_1, f_2, f_3, f_4, f_5,
f_6, f_7, f_8$ (or in any other way the reader prefers) with the following definitions.

\[
\begin{array}{llll}
f_1: & f_1(x_1) = 0, & f_1(x_2) = 0, & f_1(x_3) = 0. \\
f_2: & f_2(x_1) = 0, & f_2(x_2) = 0, & f_2(x_3) = 1. \\
f_3: & f_3(x_1) = 0, & f_3(x_2) = 1, & f_3(x_3) = 0. \\
f_4: & f_4(x_1) = 1, & f_4(x_2) = 0, & f_4(x_3) = 0. \\
f_5: & f_5(x_1) = 0, & f_5(x_2) = 1, & f_5(x_3) = 1. \\
f_6: & f_6(x_1) = 1, & f_6(x_2) = 0, & f_6(x_3) = 1. \\
f_7: & f_7(x_1) = 1, & f_7(x_2) = 1, & f_7(x_3) = 0. \\
f_8: & f_8(x_1) = 1, & f_8(x_2) = 1, & f_8(x_3) = 1.
\end{array}
\]

It needs to be emphasized that functions are not strangers to us. We have
been dealing with them in all courses in mathematics. The functions under
consideration usually came from $\operatorname{Map}(\mathbf{R}, \mathbf{R})$; for example, the sine function
$x \to \sin x$, or any polynomial function such as $x \to 3x^5 - 2x^4 + x^2 - x + 4$,
or the exponential function $x \to e^x$.

Returning to the arbitrary set $X$ and ring $R$, we observe that for
$f, g \in \operatorname{Map}(X, R)$ the definition of equality for functions says that $f = g$ if and
only if $f(x) = g(x)$ for all $x \in X$; in other words, two functions are considered
equal (or the same) if and only if they take the same values for every choice of $x \in X$.

Our objective is to make $\operatorname{Map}(X, R)$ into a ring in a natural way, so
for any $f, g \in \operatorname{Map}(X, R)$ we must define $f + g$ and $f \cdot g$. To do this, we need
to specify what $f + g$ and $f \cdot g$ do to every $x \in X$; that is, we must give the
“value” of $f + g$ and $f \cdot g$ at every $x \in X$. Naturally, we do this precisely in
the expected manner; first, one “evaluates” both $f$ and $g$ at $x$ and then adds
or multiplies the results. In symbols

\[
\begin{aligned}
(f + g)(x) &= f(x) + g(x), \\
(f \cdot g)(x) &= f(x) \cdot g(x),
\end{aligned}
\qquad \text{for all } x \in X.
\]

Note that both $f(x)$ and $g(x)$ are in $R$, so they can indeed be added and
multiplied. On the other hand, there is no need for any kind of structure on the set $X$.

Now, let us verify that $\operatorname{Map}(X, R)$ is a ring with respect to these operations.
In view of the definitions, $\operatorname{Map}(X, R)$ is clearly closed under both addition and
multiplication. The two associative laws, $(f + g) + h = f + (g + h)$ and
$(f \cdot g) \cdot h = f \cdot (g \cdot h)$ for all $f, g, h \in \operatorname{Map}(X, R)$, are trivial. For example, to
prove the latter, we observe that for $x \in X$,

\[
\begin{aligned}
((f \cdot g) \cdot h)(x) &= ((f \cdot g)(x)) \cdot h(x) = (f(x) \cdot g(x)) \cdot h(x), \\
(f \cdot (g \cdot h))(x) &= f(x) \cdot ((g \cdot h)(x)) = f(x) \cdot (g(x) \cdot h(x)),
\end{aligned}
\]

and these are equal because multiplication in $R$ is associative. [Note that we
are using $+$ and $\cdot$ for the operations in both $R$ and $\operatorname{Map}(X, R)$. There is little
danger of confusion, especially because everything goes smoothly.]

If we define the map $0: X \to R$ by

\[ 0(x) = 0 \quad \text{for all } x \in X \]

(meaning that the mapping 0 takes the value 0 in $R$ for every $x \in X$), then
for any $f \in \operatorname{Map}(X, R)$

\[ (f + 0)(x) = f(x) + 0(x) = f(x) + 0 = f(x) \]

for all $x \in X$, so $f + 0 = f$ and 0 is a zero element.

For $f \in \operatorname{Map}(X, R)$, we seek an additive inverse $-f$ (or $\bar{f}$). If we define
$-f$ by

\[ (-f)(x) = -(f(x)) \quad \text{for all } x \in X \]

(meaning that we find $f(x)$ and then take its additive inverse $-(f(x))$ in
$R$), then $-f \in \operatorname{Map}(X, R)$, and $f + (-f) = 0$ because

\[ (f + (-f))(x) = f(x) + (-f)(x) = f(x) - f(x) = 0 = 0(x) \]

for all $x \in X$. (The reader should justify each equal sign carefully.)

The commutative law for addition, $f + g = g + f$, is trivial. Finally, the
two distributive laws, $f(g + h) = fg + fh$ and $(g + h)f = gf + hf$, hold; for
example, the first one is valid because

\[
\begin{aligned}
(f(g + h))(x) &= f(x)(g(x) + h(x)) \\
&= f(x)g(x) + f(x)h(x) \\
&= (fg)(x) + (fh)(x) \\
&= (fg + fh)(x)
\end{aligned}
\]

for all $x \in X$. This completes the verification of the ring axioms in $\operatorname{Map}(X, R)$;
the reader should note that the verification of each axiom depends ultimately
on the fact that $R$ is a ring.
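
The pointwise definitions translate directly into code, and for a small finite case every axiom can be checked exhaustively. The sketch below is illustrative: the choice $R = \mathbf{Z}_4$ (integers modulo 4), the two-element set `X`, and all function names are assumptions made for the example. Functions $X \to R$ are stored as dictionaries, so equality of dictionaries is exactly equality of functions.

```python
from itertools import product

M = 4                    # assumed sample ring R = Z_4 (integers modulo 4)
X = ["p", "q"]           # assumed sample set; no structure on X is used

def r_add(a, b):
    return (a + b) % M

def r_mul(a, b):
    return (a * b) % M

def f_add(f, g):
    """(f + g)(x) = f(x) + g(x), computed in R."""
    return {x: r_add(f[x], g[x]) for x in X}

def f_mul(f, g):
    """(f . g)(x) = f(x) . g(x), computed in R."""
    return {x: r_mul(f[x], g[x]) for x in X}

def f_neg(f):
    """(-f)(x) = -(f(x)), computed in R."""
    return {x: (-f[x]) % M for x in X}

zero = {x: 0 for x in X}

# Every mapping X -> R: one choice of value in R for each element of X.
all_maps = [dict(zip(X, values)) for values in product(range(M), repeat=len(X))]

def axioms_hold():
    for f in all_maps:
        if f_add(f, zero) != f or f_add(f, f_neg(f)) != zero:
            return False
        for g in all_maps:
            if f_add(f, g) != f_add(g, f):
                return False
            for h in all_maps:
                if f_add(f_add(f, g), h) != f_add(f, f_add(g, h)):
                    return False
                if f_mul(f_mul(f, g), h) != f_mul(f, f_mul(g, h)):
                    return False
                if f_mul(f, f_add(g, h)) != f_add(f_mul(f, g), f_mul(f, h)):
                    return False
                if f_mul(f_add(g, h), f) != f_add(f_mul(g, f), f_mul(h, f)):
                    return False
    return True

print(len(all_maps), axioms_hold())
```

Each check inside `axioms_hold` reduces to the corresponding identity in $R$ at every point of $X$, which mirrors the structure of the verification just given.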

To illustrate, let us look at the ring $\operatorname{Map}(X, R)$ in the special case
mentioned earlier where $X = \{x_1, x_2, x_3\}$, $R = \mathbf{Z}_2 = \{0, 1\}$. It is easy to perform
the operations in this ring. For example, to compute $f_3 + f_4$ we observe that

\[
\begin{aligned}
(f_3 + f_4)(x_1) &= f_3(x_1) + f_4(x_1) = 0 + 1 = 1, \\
(f_3 + f_4)(x_2) &= f_3(x_2) + f_4(x_2) = 1 + 0 = 1, \\
(f_3 + f_4)(x_3) &= f_3(x_3) + f_4(x_3) = 0 + 0 = 0,
\end{aligned}
\]

and consequently, $f_3 + f_4 = f_7$. Similarly, to find $f_3 \cdot f_4$ we observe that

\[
\begin{aligned}
(f_3 f_4)(x_1) &= f_3(x_1) f_4(x_1) = 0 \cdot 1 = 0, \\
(f_3 f_4)(x_2) &= f_3(x_2) f_4(x_2) = 1 \cdot 0 = 0, \\
(f_3 f_4)(x_3) &= f_3(x_3) f_4(x_3) = 0 \cdot 0 = 0,
\end{aligned}
\]

and consequently, $f_3 \cdot f_4 = f_1$. More generally, we may safely leave it to the
reader to check the accuracy of the following tables for addition and
multiplication in the eight-element ring under consideration.

Addition in Map(X, Z_2)

  +   | f_1  f_2  f_3  f_4  f_5  f_6  f_7  f_8
 -----+----------------------------------------
  f_1 | f_1  f_2  f_3  f_4  f_5  f_6  f_7  f_8
  f_2 | f_2  f_1  f_5  f_6  f_3  f_4  f_8  f_7
  f_3 | f_3  f_5  f_1  f_7  f_2  f_8  f_4  f_6
  f_4 | f_4  f_6  f_7  f_1  f_8  f_2  f_3  f_5
  f_5 | f_5  f_3  f_2  f_8  f_1  f_7  f_6  f_4
  f_6 | f_6  f_4  f_8  f_2  f_7  f_1  f_5  f_3
  f_7 | f_7  f_8  f_4  f_3  f_6  f_5  f_1  f_2
  f_8 | f_8  f_7  f_6  f_5  f_4  f_3  f_2  f_1

Multiplication in Map(X, Z_2)

  ·   | f_1  f_2  f_3  f_4  f_5  f_6  f_7  f_8
 -----+----------------------------------------
  f_1 | f_1  f_1  f_1  f_1  f_1  f_1  f_1  f_1
  f_2 | f_1  f_2  f_1  f_1  f_2  f_2  f_1  f_2
  f_3 | f_1  f_1  f_3  f_1  f_3  f_1  f_3  f_3
  f_4 | f_1  f_1  f_1  f_4  f_1  f_4  f_4  f_4
  f_5 | f_1  f_2  f_3  f_1  f_5  f_2  f_3  f_5
  f_6 | f_1  f_2  f_1  f_4  f_2  f_6  f_4  f_6
  f_7 | f_1  f_1  f_3  f_4  f_3  f_4  f_7  f_7
  f_8 | f_1  f_2  f_3  f_4  f_5  f_6  f_7  f_8
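
Both tables follow mechanically from the pointwise definitions, so they can be regenerated rather than checked entry by entry. The sketch below is one way to do it; each function is stored as its triple of values $(f(x_1), f(x_2), f(x_3))$ using the labels defined earlier in this example, and the helper names are assumptions.

```python
# Each mapping is recorded as (f(x1), f(x2), f(x3)), with the labels used in the text.
LABELS = {
    "f1": (0, 0, 0), "f2": (0, 0, 1), "f3": (0, 1, 0), "f4": (1, 0, 0),
    "f5": (0, 1, 1), "f6": (1, 0, 1), "f7": (1, 1, 0), "f8": (1, 1, 1),
}
NAME_OF = {values: name for name, values in LABELS.items()}

def pointwise(op, f, g):
    """Apply op coordinate by coordinate, reducing modulo 2 (arithmetic in Z_2)."""
    return tuple(op(a, b) % 2 for a, b in zip(f, g))

def table(op):
    """Map each ordered pair of labels to the label of the result."""
    return {
        (m, n): NAME_OF[pointwise(op, LABELS[m], LABELS[n])]
        for m in LABELS
        for n in LABELS
    }

add_table = table(lambda a, b: a + b)
mul_table = table(lambda a, b: a * b)

def show(tbl, symbol):
    """Print a table with one row per left operand."""
    names = list(LABELS)
    print(symbol.ljust(3) + " | " + "  ".join(names))
    for m in names:
        print(m.ljust(3) + " | " + "  ".join(tbl[(m, n)] for n in names))

show(add_table, "+")
show(mul_table, ".")
```

The entries for the pair $f_3$, $f_4$ reproduce the two computations carried out by hand: the sum is $f_7$ and the product is $f_1$.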


We close out our discussion with a matter of real interest; namely, how
$\operatorname{Map}(X, R)$ can be interpreted as generalizing the direct sum rings $R^{(n)}$ or
$R^{(\infty)}$. More precisely, suppose the set $X$ has $n$ elements, so there is no harm
in writing $X = \{1, 2, 3, \ldots, n\}$. Consider any $f \in \operatorname{Map}(X, R)$, where the ring $R$
is arbitrary. For each $i = 1, 2, \ldots, n$, $f(i)$ is an element of $R$, and we may
associate with $f$ the $n$-tuple $(f(1), f(2), \ldots, f(n))$ which is, of course, an
element of $R^{(n)}$. Conversely, given any element $(a_1, a_2, \ldots, a_n) \in R^{(n)}$ we can use
it to define the element $f \in \operatorname{Map}(X, R)$ for which $f(i) = a_i$, $i = 1, \ldots, n$, and
then the element of $R^{(n)}$ associated with $f$ is precisely $(a_1, a_2, \ldots, a_n)$.
Therefore, there is a “natural” 1-1 correspondence between sets

\[ \operatorname{Map}(X, R) \leftrightarrow R^{(n)} \]

according to which

\[ f \leftrightarrow (a_1, a_2, \ldots, a_n), \]

when $f(i) = a_i$ for $i = 1, 2, \ldots, n$. Suppose further that $g \in \operatorname{Map}(X, R)$
corresponds to the $n$-tuple $(b_1, b_2, \ldots, b_n)$ in $R^{(n)}$; thus

\[ g \leftrightarrow (b_1, b_2, \ldots, b_n), \]

and $g(i) = b_i$ for $i = 1, \ldots, n$. Which elements of $R^{(n)}$ correspond to $f + g$ and
$fg$, respectively? Since for each $i = 1, \ldots, n$

\[ (f + g)(i) = f(i) + g(i) = a_i + b_i \quad \text{and} \quad (fg)(i) = f(i)g(i) = a_i b_i, \]

it follows immediately that

\[ f + g \leftrightarrow (a_1 + b_1,\ a_2 + b_2,\ \ldots,\ a_n + b_n), \]

and

\[ fg \leftrightarrow (a_1 b_1,\ a_2 b_2,\ \ldots,\ a_n b_n). \]

The significance of this is that under the 1-1 correspondence between the
rings $\operatorname{Map}(X, R)$ and $R^{(n)}$, addition and multiplication also correspond; in
other words, the ring operations in the two rings correspond to each other.
This leads to the following conclusion.

    When $X = \{1, 2, \ldots, n\}$ (or any set with $n$ elements), the
    rings $\operatorname{Map}(X, R)$ and $R^{(n)}$ are “essentially” the same.        (*)

In particular, for the concrete example discussed earlier where $X = \{x_1, x_2, x_3\}$
and $R = \mathbf{Z}_2$, we see that the rings $\operatorname{Map}(X, \mathbf{Z}_2)$ and $\mathbf{Z}_2 \oplus \mathbf{Z}_2 \oplus \mathbf{Z}_2$ are
essentially the same. The correspondence between them is given by

\[
\begin{array}{llll}
f_1 \leftrightarrow (0, 0, 0), & f_2 \leftrightarrow (0, 0, 1), & f_3 \leftrightarrow (0, 1, 0), & f_4 \leftrightarrow (1, 0, 0), \\
f_5 \leftrightarrow (0, 1, 1), & f_6 \leftrightarrow (1, 0, 1), & f_7 \leftrightarrow (1, 1, 0), & f_8 \leftrightarrow (1, 1, 1).
\end{array}
\]

The result (*) can be extended. Namely, if $X$ is the set of all positive
integers, $X = \{1, 2, 3, \ldots\}$ and the ring $R$ is arbitrary, then clearly the rings
$\operatorname{Map}(X, R)$ and $R^{(\infty)}$ are essentially the same. The correspondence is

\[ f \leftrightarrow (f(1), f(2), f(3), \ldots). \]

2-2-13. Example. Consider an arbitrary set $X = \{a, b, c, \ldots, x, y, \ldots\}$,
which may be finite or infinite. Let $\mathcal{S}(X)$ denote the set of all subsets of $X$.
Elements of $\mathcal{S}(X)$, which are subsets of $X$, are denoted by $A, B, C, \ldots$. In
particular, the empty set $\emptyset$ and the set $X$ itself are elements of $\mathcal{S}(X)$.

Can we make $\mathcal{S}(X)$ into a ring? Perhaps the most obvious approach is to
define for $A, B \in \mathcal{S}(X)$

\[
\begin{aligned}
A + B &= A \cup B, \\
A \cdot B &= A \cap B.
\end{aligned}
\]

We recall that the union $A \cup B$ consists of those elements $x$ of $X$ which belong
to $A$, or to $B$, or to both, and the intersection $A \cap B$ consists of all those
elements $x$ of $X$ which belong to both $A$ and $B$.

Now, $\mathcal{S}(X)$ is surely closed under addition and multiplication. Both
associative laws clearly hold; in fact, $A \cup (B \cup C) = (A \cup B) \cup C$ because
each side consists of those elements $x$ which belong to at least one of the sets
$A, B, C$, and $A \cap (B \cap C) = (A \cap B) \cap C$ because each side consists of those
elements $x$ which belong to all three of the sets $A, B, C$. Since $A + \emptyset = A$ for
all $A \in \mathcal{S}(X)$, we see that $\emptyset$ is a zero element. However, given $A \neq \emptyset$ there is
no $B$ for which $A + B = A \cup B = \emptyset$, so additive inverses do not exist, and
our attempt to make $\mathcal{S}(X)$ into a ring fails.

Let us try another approach. As is customary, we let $A - B$ (which is read
as “$A$ minus $B$”) denote the set of all elements which belong to $A$ but not to
$B$; in symbols,

\[ A - B = \{x \in X \mid x \in A,\ x \notin B\}. \]

Then define operations in $\mathcal{S}(X)$ by

\[
\begin{aligned}
A + B &= (A - B) \cup (B - A), \\
A \cdot B &= A \cap B.
\end{aligned}
\]

The meaning of $A + B$, which is known as the symmetric difference of $A$ and $B$,
may be understood from the accompanying Venn diagram,

[Figure: Venn diagram of two overlapping sets $A$ and $B$, shaded except for the overlap]

where $A + B$ is precisely the total shaded area, which consists of the union
of $A - B$ and $B - A$. Another way to characterize the shaded area $A + B$ is as
the set of all elements $x$ which are in $A$ or in $B$ but not in both, so clearly

\[ A + B = (A - B) \cup (B - A) = (A \cup B) - (A \cap B). \]

Now, let us examine the ring axioms for $\mathcal{S}(X)$. Obviously we have
closure for both addition and multiplication. The associative law for
multiplication $A \cdot (B \cdot C) = (A \cdot B) \cdot C$ is clear, because it asserts that
$A \cap (B \cap C) = (A \cap B) \cap C$. The associative law for addition, $A + (B + C) = (A + B) + C$,
is rather difficult to prove in a formal way; however, the reader may convince
himself of its validity by use of the accompanying Venn diagram

[Figure: Venn diagram of three overlapping sets $A$, $B$, $C$ with the region $A + B + C$ shaded]

in which both $A + (B + C)$ and $(A + B) + C$ turn out to be the shaded
area. Since $A + \emptyset = A$ for all $A \in \mathcal{S}(X)$, $\emptyset$ is a zero element. Then given
$A \in \mathcal{S}(X)$ we seek $B$ such that $A + B = \emptyset$; trying $B = A$, we have

\[ A + A = (A - A) \cup (A - A) = \emptyset \cup \emptyset = \emptyset \]

so $A$ is always its own inverse under addition. [Note: One may choose to
denote the additive inverse of $A$ by $\bar{A}$, so $\bar{A} = A$, but it would be a mistake to
denote it by $-A$ because all kinds of things can go wrong. For example, once
$\mathcal{S}(X)$ is shown to be a ring, our notational conventions call for writing
$B + (-A)$ (which equals $B + \bar{A}$) as $B - A$, which is not the same as the
difference $B - A$ defined above. Thus, for us the unadorned symbol $-A$ will have
no meaning; it will be permissible only when it appears in a difference such
as $B - A$.] Of course, addition and multiplication are commutative, and $X$ is
an identity for multiplication since

\[ AX = XA = X \cap A = A \quad \text{for all } A \in \mathcal{S}(X). \]

Finally, because multiplication is commutative, we need only check one
distributive law, say $A(B + C) = AB + AC$. This amounts to showing that

\[ A \cap ((B - C) \cup (C - B)) = ((A \cap B) - (A \cap C)) \cup ((A \cap C) - (A \cap B)), \]

a nontrivial task at this stage. However, the reader may convince himself
that the distributive law holds because both $A(B + C)$ and $AB + AC$ represent
the shaded area in the following Venn diagram.

[Figure: Venn diagram of three overlapping sets $A$, $B$, $C$ with the region $A(B + C)$ shaded]

All this shows that $\mathcal{S}(X)$ is a commutative ring with unity.
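
For a small set $X$ the laws that were read off from Venn diagrams can be confirmed by exhaustive search, which is more convincing than a picture although still short of a general proof. The sketch below is illustrative; the three-element set and the names `s_add`, `s_mul` are assumptions, and Python's `frozenset` supplies set difference, union, and intersection.

```python
from itertools import combinations, product

X = frozenset({1, 2, 3})      # assumed sample set
EMPTY = frozenset()

# All subsets of X.
subsets = [
    frozenset(c)
    for r in range(len(X) + 1)
    for c in combinations(sorted(X), r)
]

def s_add(A, B):
    """Symmetric difference: (A - B) union (B - A)."""
    return (A - B) | (B - A)

def s_mul(A, B):
    """Intersection."""
    return A & B

def ring_axioms_hold():
    for A in subsets:
        # zero element, self-inverse, multiplicative identity
        if s_add(A, EMPTY) != A or s_add(A, A) != EMPTY or s_mul(A, X) != A:
            return False
    for A, B, C in product(subsets, repeat=3):
        if s_add(A, s_add(B, C)) != s_add(s_add(A, B), C):
            return False
        if s_mul(A, s_add(B, C)) != s_add(s_mul(A, B), s_mul(A, C)):
            return False
    return True

# With union as "addition", a nonempty set has no additive inverse.
union_inverse_exists = any((frozenset({1}) | B) == EMPTY for B in subsets)

print(len(subsets), ring_axioms_hold(), union_inverse_exists)
```

The last line of the check also records why the first attempt, with union as addition, had to be abandoned.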

In connection with the foregoing discussion two questions may be raised.
Firstly, where do the definitions of the operations, especially addition, in $\mathcal{S}(X)$
come from? Why do these operations work, in the sense that $\mathcal{S}(X)$ becomes a
ring? What is really going on in $\mathcal{S}(X)$? Secondly, even though a picture may
be worth a thousand words, the use of Venn diagrams to prove the
associative law for addition and the distributive law should not be construed as
providing a formal algebraic proof. Can we give more “honest” proofs? These
questions will be answered momentarily by showing how $\mathcal{S}(X)$ can be
viewed in another, more incisive, way.

Consider our fixed set $X$ and the two-element ring $\mathbf{Z}_2 = \{0, 1\}$. For each
given $A \in \mathcal{S}(X)$, let us associate with it the element $\chi_A$ (to be read as chi sub
$A$) of the ring $\operatorname{Map}(X, \mathbf{Z}_2)$ which is defined by

\[
\chi_A(x) = \begin{cases} 1 & \text{if } x \in A, \\ 0 & \text{if } x \notin A. \end{cases}
\]

In particular, $x \in A \iff \chi_A(x) = 1$. We call $\chi_A$ the characteristic function of the
set $A$; it takes the value 1 for $x \in A$ and the value 0 for $x \in X - A$. On the
other hand, given $f \in \operatorname{Map}(X, \mathbf{Z}_2)$, let us associate with it the set

\[ A = \{x \in X \mid f(x) = 1\}; \]

in other words, $A$ (which is more appropriately denoted by $A_f$) is the set of
all elements $x$ for which $f$ takes the value 1. What is the characteristic function
$\chi_A$ of this set $A$? The answer is obvious from the definitions; namely, $\chi_A = f$.
We see, therefore, that there is a 1-1 correspondence

\[ A \leftrightarrow \chi_A \]

between the sets

\[ \mathcal{S}(X) \leftrightarrow \operatorname{Map}(X, \mathbf{Z}_2). \]

Consequently, because $\operatorname{Map}(X, \mathbf{Z}_2)$ is a ring (in fact, it is a commutative
ring with unity) a natural way to make $\mathcal{S}(X)$ into a ring is to “transfer” the
operations bodily from $\operatorname{Map}(X, \mathbf{Z}_2)$. To do this, we must [because every
element of $\operatorname{Map}(X, \mathbf{Z}_2)$ is of form $\chi_A$ for some $A \in \mathcal{S}(X)$] look first at $\chi_A + \chi_B$
and $\chi_A \chi_B$; that is, we must examine addition and multiplication in $\operatorname{Map}(X, \mathbf{Z}_2)$
in terms of this new notation for elements of $\operatorname{Map}(X, \mathbf{Z}_2)$. According to the
definition of multiplication in $\operatorname{Map}(X, \mathbf{Z}_2)$ we have, for $x \in X$,

\[ (\chi_A \chi_B)(x) = \chi_A(x)\chi_B(x). \]

So, it follows from the rules for multiplication in $\mathbf{Z}_2$ that the right-hand side
is 1 only when both $\chi_A(x)$ and $\chi_B(x)$ are 1. Thus,

\[
(\chi_A \chi_B)(x) = \begin{cases} 1 & \text{if } x \in A \cap B, \\ 0 & \text{if } x \notin A \cap B, \end{cases}
\]

and, therefore,

\[ \chi_A \chi_B = \chi_{A \cap B}. \tag{$*$} \]

In other words, because $\chi_A \chi_B$ is the function which takes the value 1 for
$x \in A \cap B$ and 0 elsewhere, it is none other than $\chi_{A \cap B}$, the characteristic
function of $A \cap B$.





Furthermore, for x ∈ X, we have by the definition of addition in
Map(X, Z_2),

(χ_A + χ_B)(x) = χ_A(x) + χ_B(x)

and the right-hand side equals 1 if and only if one of the terms is 0 and the other
is 1. It follows that

(χ_A + χ_B)(x) = 1 if x ∈ (A − B) ∪ (B − A), and (χ_A + χ_B)(x) = 0 if x ∉ (A − B) ∪ (B − A),

so that χ_A + χ_B is precisely the characteristic function of the set
(A − B) ∪ (B − A); in symbols,

χ_A + χ_B = χ_{(A−B)∪(B−A)}. (**)

Now, we are ready to transfer the operations from the ring Map(X, Z_2)
to the set 𝒮(X). To transfer addition, A + B must be taken as the subset of
X which corresponds to the element χ_A + χ_B of the ring Map(X, Z_2). In
virtue of (**), the “correct” definition is

A + B = (A − B) ∪ (B − A).

Similarly, to transfer multiplication, AB must be taken as the subset of X
which corresponds to the element χ_A χ_B of the ring Map(X, Z_2). But in virtue
of (*), the subset corresponding to χ_A χ_B = χ_{A∩B} is precisely A ∩ B, so the
“correct” definition is

A · B = A ∩ B.

Because of the way the operations of addition and multiplication were transferred
from Map(X, Z_2) to 𝒮(X), it is clear that 𝒮(X) becomes a ring which
is essentially the same as Map(X, Z_2).
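The transfer of operations can be tried out on a small example. The following Python sketch is only an illustration under stated assumptions: the three-element set X and the helper names (chi, subset_of, add, mul, all_subsets) are our own choices and do not come from the text. It represents an element of Map(X, Z_2) as a dictionary and checks (*) and (**) for every pair of subsets.

```python
from itertools import combinations

X = frozenset({0, 1, 2})  # a small universal set, chosen only for illustration


def chi(A):
    """Characteristic function of A as a dict x -> 0 or 1 (an element of Map(X, Z_2))."""
    return {x: 1 if x in A else 0 for x in X}


def subset_of(f):
    """The subset A_f = {x in X | f(x) = 1} attached to a map f: X -> Z_2."""
    return frozenset(x for x in X if f[x] == 1)


def add(f, g):
    """Pointwise addition in Map(X, Z_2)."""
    return {x: (f[x] + g[x]) % 2 for x in X}


def mul(f, g):
    """Pointwise multiplication in Map(X, Z_2)."""
    return {x: (f[x] * g[x]) % 2 for x in X}


def all_subsets(S):
    """List every subset of S as a frozenset."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


subsets = all_subsets(X)

# (*): chi_A * chi_B corresponds to the intersection of A and B
star_holds = all(subset_of(mul(chi(A), chi(B))) == A & B for A in subsets for B in subsets)
# (**): chi_A + chi_B corresponds to (A - B) united with (B - A)
double_star_holds = all(
    subset_of(add(chi(A), chi(B))) == (A - B) | (B - A) for A in subsets for B in subsets
)
# Passing from A to chi_A and back returns A, so the correspondence is 1-1
round_trip = all(subset_of(chi(A)) == A for A in subsets)
```

Such a check on one small set is of course no substitute for the argument given above; it merely confirms the two formulas in a case that can be examined by hand.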

2-2-14. Exercise. This exercise is concerned with the so-called “calculus
of sets.” All our sets, A, B, C, ..., are assumed to be contained in a universal
set X. In addition to using the symbols ∪, ∩, − dealt with in 2-2-13, we write
A^c for X − A and call it the complement of A.

To prove relations such as inclusion or equality between sets, Venn
diagrams are suggestive, but they do not constitute a proof in themselves. One
must give a formal, logical argument using words and symbols. For example,
the elemental way to show that A ⊂ B involves verifying that every element of
A belongs to B (that is, x ∈ A ⇒ x ∈ B); a picture will not do. Also, a standard
way to prove A = B is to show that A ⊂ B and B ⊂ A. (Our notation A ⊂ B,
it should be noted, does not exclude the possibility A = B.) Of course, after
the elementary properties have been proved the more complicated ones can
often be proved directly (that is, by “computational” techniques) rather than
having to deal with individual elements. Now prove:

For any sets A, B, C we have

(i) If A ⊂ B and B ⊂ C, then A ⊂ C,

(ii) A ∩ B ⊂ A and A ∩ B ⊂ B,

(iii) A ⊂ A ∪ B and B ⊂ A ∪ B,

(iv) A ⊂ B ∩ C ⇔ A ⊂ B and A ⊂ C,

(v) A ∪ B ⊂ C ⇔ A ⊂ C and B ⊂ C,

(vi) A ∩ B = A ⇔ A ⊂ B,

(vii) A ∪ B = A ⇔ B ⊂ A,

(viii) A ⊂ B ⇔ B^c ⊂ A^c,

(ix) A ∪ A^c = X and A ∩ A^c = ∅,

(x) (A^c)^c = A,

(xi) A ⊂ B^c ⇔ A ∩ B = ∅,

(xii) A ⊃ B^c ⇔ A ∪ B = X,

(xiii) A − B = A − (A ∩ B) = A ∩ B^c,

(xiv) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),

(xv) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C),

(xvi) de Morgan’s laws: (A ∩ B)^c = A^c ∪ B^c, (A ∪ B)^c = A^c ∩ B^c,

(xvii) (A − B) ∩ (A − C) = A − (B ∪ C),

(xviii) (A − B) ∪ (A − C) = A − (B ∩ C),

(xix) (A − B) ∪ (B − A) = (A ∪ B) − (A ∩ B) = (A ∩ B)^c ∩ (A ∪ B).
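Before attempting the formal proofs, the reader may wish to test a few of these relations mechanically. The sketch below is a sanity check and not a proof: it assumes a four-element universal set (an arbitrary choice) and uses helper names of our own (subsets, comp, check_identities) to run through every triple of subsets and test (xvi) through (xix).

```python
from itertools import combinations, product

X = frozenset(range(4))  # small universal set; the size 4 is an arbitrary choice


def subsets(S):
    """List every subset of S as a frozenset."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def comp(A):
    """Complement A^c = X - A."""
    return X - A


def check_identities():
    """Return True if (xvi)-(xix) hold for every triple of subsets of X."""
    for A, B, C in product(subsets(X), repeat=3):
        ok = (
            comp(A & B) == comp(A) | comp(B)          # (xvi) de Morgan
            and comp(A | B) == comp(A) & comp(B)      # (xvi) de Morgan
            and (A - B) & (A - C) == A - (B | C)      # (xvii)
            and (A - B) | (A - C) == A - (B & C)      # (xviii)
            and (A - B) | (B - A) == (A | B) - (A & B) == comp(A & B) & (A | B)  # (xix)
        )
        if not ok:
            return False
    return True


all_hold = check_identities()
```

A failure here would expose a misstated identity; a success only says that no counterexample exists inside this particular X.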


2-2-15 / PROBLEMS 


1. By verifying the axioms, decide which of the following are rings with 
respect to the standard operations: 


(0 {0 + fee Z}, 

(ii) {a + b∛3 + c∛9 | a, b, c ∈ Z},
(iii) {a + b√2 | a, b ∈ 2Z},


(«>) 


(tO 



a, b e Z 


a, b e Z 


2. By verifying the axioms, decide which of the following are rings with
respect to the standard operations:

(i) {m/n | m, n ∈ Z, (m, n) = 1, n > 0, n odd},

(ii) {m/p^r | m ∈ Z, p a fixed prime, r ≥ 1},

(iii) {m/n | m, n ∈ Z, n ≠ 0, (m, n) = 1, (n, p) = 1}, p is a fixed prime.


What is the significance of the fact that rational numbers m/n can be 
written in more than one way ? 

3. Suppose {R, +, ·} is a ring. If we define a new “multiplication” operation
∘ in R by: a ∘ b = 0 for all a, b ∈ R, is {R, +, ∘} a ring?


4. In Z × Z define addition and multiplication by

(a, b) + (c, d) = (a + c, b + d),

(a, b) · (c, d) = (ac + bd, ad + bc + bd).

Verify that this is a commutative ring with unity. Is it an integral domain?

5. Consider Z with the usual operations + and ·. Now, let us define new
operations ⊥ and ∘ (to serve as addition and multiplication, respectively)
by

a ⊥ b = a + b + 1, a ∘ b = a + b + ab.

Show that {Z, ⊥, ∘} is a commutative ring with unity; the zero element
is −1, and the identity for multiplication is 0. Is it an integral domain?

6. Suppose {R, +, ·} is a ring.

(i) If we define a ⊥ b = a · b and a ∘ b = a + b, is {R, ⊥, ∘} a ring?

(ii) If we define a ∘ b = a + b + a · b, is {R, +, ∘} a ring? Does it have
a unity?

(iii) If {R, +, ·} has a unity e, and we define

a ⊥ b = a + b − e, a ∘ b = a + b − ab

is {R, ⊥, ∘} a ring? Is it commutative? Does it have a unity?

7. Let S denote the set of all positive integers. Is S a subring of Z?


8. Is Z_2 a subring of Z? Why? How about Z_m?

9. All the sets considered in Problems 1 and 2 above may be viewed as 
subsets of R. Which ones are subrings of R? This provides a simple way 
to settle the question raised in Problems 1 and 2—namely, which ones 
are rings with respect to the standard operations. 
10. In each of the following cases, find as many subrings of the given ring as
you can: 

(0 00 Z 6 , (iii) Z 12 . 

Are there any other subrings ? In each case, can you prove that you have 
found all subrings? 

11. If m is an integer greater than or equal to 0 show that mZ = {mn | n ∈ Z}
is a subring of Z. Show further that every subring of Z is of this form; or,
to put it another way, there are no other subrings of Z.

12. (i) Let m be a nonzero integer. Is mQ = {ma | a ∈ Q} a subring of Q
under the usual operations?

(ii) Except for the subrings (0) and Q of the ring Q, can you find any
other subrings?

13. (i) Let ω = (1 + √5)/2 and consider the subset

Z[ω] = {a + bω | a, b ∈ Z}

of R. Show that Z[ω] is a subring of R, and even more that Z[ω]
is an integral domain.

(ii) Observe that Z[ω] is essentially the same as the ring of Problem 4
above, and conclude thereby that the ring of Problem 4 is an integral
domain.

14. (i) Let R_1 and R_2 be subrings of the ring R; then R_1 ∩ R_2 is a subring of
R.

(ii) Furthermore, the intersection

⋂_{i=1}^{n} R_i = R_1 ∩ R_2 ∩ ··· ∩ R_n

of a finite collection of subrings R_1, R_2, ..., R_n is a subring.

(iii) Does this result carry over to the intersection of an infinite collection
of subrings?

15. If R_1 and R_2 are subrings of the ring R show by example that R_1 ∪ R_2
need not be a ring. In fact, R_1 ∪ R_2 is a subring ⇔ R_1 ⊂ R_2 or R_2 ⊂ R_1.

16. Fix an element x ≠ 0 of the ring R.

(i) Is the set xR = {xa | a ∈ R} a subring of R? How about Rx =
{ax | a ∈ R}?

(ii) Is A_x = {a ∈ R | ax = 0} a subring of R?

(iii) Consider the set A = {a ∈ R | aR = (0)}, so a ∈ A if and only if ab = 0
for all b ∈ R. Is A a subring of R? How is A related to the A_x’s?

17. The center ℨ of the ring R is defined by

ℨ = {a ∈ R | ab = ba for all b ∈ R}.

Show that ℨ is a subring of R.

18. (i) Suppose R_1 and R_2 are subrings of the ring R and we write

R_1 + R_2 = {a_1 + a_2 | a_1 ∈ R_1, a_2 ∈ R_2}.

Thus, R_1 + R_2, which is called the sum of R_1 and R_2, consists of all
elements of R which arise by adding an element of R_1 and an element
of R_2. (Of course, R_1 + R_2 has nothing to do with the direct sum or
with the sum, +, of subsets of R.) Is R_1 + R_2 a subring of R?

(ii) In the case where R = Z, consider two subrings mZ and nZ. Describe
the set mZ + nZ. Is it a subring of Z?

19. Use the fact that Z[i] is a subset of C to show it is an integral domain.

20. Suppose D is an integral domain with unity e, D' is an integral domain
with unity e', and D' ⊂ D; or more precisely, D' is a subdomain of D.
Show that e' = e. In other words, an integral domain and any subdomain
have the same unity.

21. Suppose S is a nonempty subset of an integral domain D . What properties 
must one check in order to guarantee that S is an integral domain (that 
is, a subdomain of D) ? 

22. Given the following matrices in ℳ(Z, 2),



compute:

(i) A + B, A + C, A + D, B + C,

(ii) A + B + C, B + C + D,

(iii) A − B, B − C, C − D,

(iv) CD, DC,

(v) BCD,

(vi) A(B + C), AB + AC, B(C − D), BC − BD.

23. Verify the two distributive laws in ℳ(Z, 2).

24. If R is a ring, verify that ℳ(R, 2) is a ring with respect to the “natural”
operations.

25. Prove, in detail, that ℳ(Z, 3) is a ring.





26. Consider the following elements of ℳ(Z, 4)



and compute:

(i) A + B, A + C, B + C, A + (B + C), (A + B) + C,

(ii) A − B, A − C, B − C,

(iii) AB, AC, BC, BA,

(iv) (AB)C, A(BC),

(v) B(A + C), BA + BC,

(vi) (C − A)B, CB − AB.

27. In 2-2-10, we showed that 

t -{{1 

is a subring of ℳ(Z, 2). Show that T is a commutative ring with unity.
Is it an integral domain?

28. Which of the following are subrings of ℳ(Z, 2):


( 0 { 

(o o) 

00 { 

{b o) |“- 6 e Z }' 

0'0 | 

(o o) I “ s z ) • 

(iv) | 

(o b M a - b - ceZ 

(v) | 

,(o t)\‘- beZ }- 
(vi)

(oii) 

(viii) 

(ix) 


(C SI- 

IC ru¬ 
le ru¬ 
le 91* 


e 3 Zj, 

be z), 
be zj, 

be Z, 


Can you find additional subrings of ℳ(Z, 2)?

29. For any complex number α = a + bi let us write ᾱ = a − bi. Show that


e-{(j 


is a subring of ℳ(C, 2). It is known as the ring of quaternions.
Consider the four elements


e = 





of Q, and verify that

λ² = μ² = ν² = −e,

λμ = ν, μν = λ, νλ = μ,

μλ = −ν, νμ = −λ, λν = −μ.


30. For any matrix A = 


in ℳ(Z, 2) define A* ∈ ℳ(Z, 2) by


and call it the transpose of A. Show that for all A, B ∈ ℳ(Z, 2),

(A + B)* = A* + B*,

(AB)* = B*A*.


31. Consider the set X = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} and the subsets

A = {0, 1, 2, 3}, B = {2, 3, 5, 7, 9}, C = {4, 6},

D = {3, 4, 6, 8, 9}, E = {1, 3, 5, 8}, F = ∅,

G = {0}, H = {4, 6}, I = {6}.

(For any subset Y of X we shall write Y^c for X − Y, and refer to it as the
complement of Y.) Find the following sets:

(1) A ∪ B, B ∪ C, E ∪ F, A ∪ G, H ∪ C, F ∪ G, X ∪ E.

(2) A ∩ D, A ∩ C, B ∩ F, A ∩ G, H ∩ I, E ∩ G, X ∩ E.

(3) A × H, A × I, C × G, C × C, H × C, G × I.

(4) A − B, B − C, E − F, F − G, C − G, H − C, I − H.

(5) A^c, C^c, F^c, G^c, I^c, (B^c)^c, (E^c)^c.

(6) A ∪ A^c, D ∪ D^c, F ∪ F^c, G ∪ G^c, I ∪ I^c.

(7) B ∩ B^c, C ∩ C^c, F ∩ F^c, G ∩ G^c, H ∩ H^c.

(8) B − (B ∩ E), A − (B − C), D − (C ∪ H), C ∪ (A − G).

(9) (A ∩ B) ∪ C, A ∩ (B ∪ C), (A ∪ C) ∩ (B ∪ C).

(10) A ∩ (B ∪ C), (A ∩ B) ∪ (A ∩ C).

(11) (H ∩ I)^c, H^c ∪ I^c.

(12) (E ∪ G)^c, E^c ∩ G^c.

(13) A − (B ∪ C), (A − B) ∩ (A − C).

(14) A − (B ∩ D), (A − B) ∪ (A − D).

32. Suppose A, B, C are sets for which A ∪ B = A ∪ C and A ∩ B =
A ∩ C = ∅; show that B = C.

33. If X is a set with n elements, show that Map(X, Z_2) has exactly 2^n elements.
How many elements are there in 𝒮(X), the ring of all subsets of X? How
many elements are there in Map(X, Z_3)?

34. Let X be a set with three elements. Give names to the eight subsets of X,
and construct addition and multiplication tables for the ring 𝒮(X).

35. Suppose a set X and a ring R are given, each with more than one element
(that is, #(X) > 1 and #(R) > 1). Then

(i) R has a unity element ⇔ Map(X, R) has a unity element.

(ii) R is commutative ⇔ Map(X, R) is commutative; and in this
situation, Map(X, R) always has divisors of zero.

36. (i) Consider the ring Map(X, R) and fix an element x_0 ∈ X. Let

M_{x_0} = {f ∈ Map(X, R) | f(x_0) = 0}.

Show that M_{x_0} is a subring of Map(X, R). Can you find other subrings?

(ii) When R = Z_2, Map(X, Z_2) may be interpreted as 𝒮(X). Consequently,
M_{x_0} may be interpreted as a subring of 𝒮(X); which one?

37. Describe some subrings of 𝒮(X).

38. If R_1 and R_2 are rings, verify carefully that R_1 ⊕ R_2 is a ring. Moreover,
R_1 ⊕ R_2 has an identity ⇔ both R_1 and R_2 have identities, and R_1 ⊕ R_2
is commutative ⇔ both R_1 and R_2 are commutative.

39. Suppose X is a set with two elements. Discuss the “sameness” of the
rings Map(X, Z_2), 𝒮(X), and Z_2 ⊕ Z_2.

40. (i) Show that S = {0, 2, 4, 6} is a subring of Z_8. Construct addition and
multiplication tables for S.

(ii) Show that T = {0, 4, 8, 12} is a subring of Z_16. Construct addition
and multiplication tables for T.

(iii) Construct addition and multiplication tables for the four-element
ring Z_2 ⊕ Z_2.

(iv) In Z_2 ⊕ Z_2, keep the same addition, and take multiplication to be
trivial; that is, the product of any two elements is 0 = (0, 0). This
is a ring (see Problem 3), which we denote by (Z_2 ⊕ Z_2)^0. Construct
addition and multiplication tables for this ring.

(v) The ring ℳ(Z_2, 2) of all 2 × 2 matrices with entries from Z_2 has
16 elements. Find a four-element subring which is not commutative.
Make addition and multiplication tables for it.

(vi) Of course, Z_4 is also a ring with four elements. Make tables for it.

(vii) Can you find any additional rings with four elements?

41. Construct several distinct rings with 72 elements. 

42. For every positive integer m , there exists at least one ring with m elements. 
Produce as many such rings as you can. 

43. Give an example of a ring with 37 elements which does not have a unity. 

44. Suppose R is a commutative ring in which there exist elements a and b
such that ab ≠ 0; then ℳ(R, 2) is not commutative.

45. Suppose S_1 is a subring of R_1 and S_2 is a subring of R_2; prove that
S_1 ⊕ S_2 is a subring of R_1 ⊕ R_2.

46. Interpret “geometrically” the ring R ⊕ R and the subring Z ⊕ Z.

47. In 2-2-11, we discussed the ring R ⊕ R and the three subrings

R' = {(a, b) ∈ R ⊕ R | b = 0},

R'' = {(a, b) ∈ R ⊕ R | a = 0},

Δ = {(a, b) ∈ R ⊕ R | a = b},

each of which is essentially the same as R. What are R' ∩ R'', R' ∩ Δ,
R'' ∩ Δ? Using the notation + introduced in Problem 18, what are
R' + R'', R' + Δ, R'' + Δ?

48. (i) We observed in 2-2-11 one way in which R^(n) could be viewed as a
subring of R^(∞). Find other ways of viewing R^(n) as a subring of R^(∞).

(ii) Find other subrings of R^(∞).

49. (i) Suppose for each i = 1, 2, 3, ... we are given a ring R_i. The product
set of all these rings, R_1 × R_2 × ··· × R_n × ···, consists of all
infinite sequences (a_1, a_2, a_3, ..., a_n, ...) where a_i ∈ R_i for each i.
When addition and multiplication are defined componentwise this
becomes a ring, called the direct product of the R_i’s and denoted by

∏_{i=1}^{∞} R_i.

Consider the set of all elements (a_1, a_2, a_3, ...) of ∏_{i=1}^{∞} R_i which
have only a finite number of nonzero components a_i (in other words,
a_i = 0 for almost all i). This set is a subring of ∏_{i=1}^{∞} R_i, which is called
the direct sum of the R_i’s and denoted by ⊕_{i=1}^{∞} R_i.

Note that if this procedure is applied to a finite number of rings
R_1, R_2, ..., R_n, then the direct product ∏_{i=1}^{n} R_i and the direct sum
⊕_{i=1}^{n} R_i are the same.

(ii) Suppose each of the rings R_i, i = 1, 2, 3, ... is the same ring R, and
we let X = {1, 2, 3, ...}. Then the direct product ring ∏_{i=1}^{∞} R_i is
essentially the same as Map(X, R). How would you interpret the
direct sum ⊕_{i=1}^{∞} R_i as a subring of Map(X, R)?
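Several of the problems above ask whether a familiar set becomes a ring under unfamiliar operations. A mechanical spot check can catch a slip in such a verification before the formal work begins. The sketch below treats the operations of Problem 5 on a finite window of Z; the window, the helper names (add, mul, neg) and the formula used for the additive inverse are our own assumptions, and passing the check proves nothing about all of Z.

```python
from itertools import product


def add(a, b):
    """The new addition of Problem 5: a + b + 1."""
    return a + b + 1


def mul(a, b):
    """The new multiplication of Problem 5: a + b + ab."""
    return a + b + a * b


def neg(a):
    """Candidate additive inverse: add(a, neg(a)) should be the zero element -1."""
    return -a - 2


ZERO, ONE = -1, 0
sample = range(-6, 7)  # a finite window of Z; a spot check, not a proof

axioms_hold = all(
    add(add(a, b), c) == add(a, add(b, c))              # new addition is associative
    and add(a, b) == add(b, a)                          # new addition is commutative
    and add(a, ZERO) == a                               # -1 acts as the zero element
    and add(a, neg(a)) == ZERO                          # additive inverses exist
    and mul(mul(a, b), c) == mul(a, mul(b, c))          # new multiplication is associative
    and mul(a, b) == mul(b, a)                          # new multiplication is commutative
    and mul(a, ONE) == a                                # 0 acts as the unity
    and mul(a, add(b, c)) == add(mul(a, b), mul(a, c))  # distributive law
    for a, b, c in product(sample, repeat=3)
)
```

The same pattern, with add and mul replaced, may be applied to the operations of Problems 4 and 6.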

2-3. Ordered and Well-Ordered Domains 

The definitions of ring and integral domain involved isolating certain
properties of Z (namely, properties concerning addition and multiplication)
and using them as “axioms” for a general system. However, there is much
more going on in Z. For example, in Z we have such notions as: the size of
an integer, comparing two integers as to size (that is, which is bigger), or
ordering the integers according to their size. We shall carry such considerations
over to a domain D by axiomatizing the notion of order in a rather
indirect fashion, that is, by focusing on the set of “positive elements.”

2-3-1. Definition. An integral domain D is said to be an ordered domain
when there exists a nonempty subset P of D, called the set of positive elements,
which satisfies the following conditions:

(i) a, b ∈ P ⇒ a + b ∈ P; in words: P is closed under addition.

(ii) a, b ∈ P ⇒ ab ∈ P; in words: P is closed under multiplication.

(iii) If we write −P = {−a | a ∈ P} (the set of additive inverses of the
elements of P), and call it the set of negative elements, then D is a
disjoint union

D = P ∪ {0} ∪ −P.

According to the definition, an element of P is said to be positive, and an
element of −P is said to be negative. Because D is a disjoint union of the three
sets P, {0}, −P we note that the element 0 is neither positive nor negative.
Condition (iii), which may be referred to as the trichotomy law, says that
for each a ∈ D exactly one of the following choices holds

a ∈ P, a = 0, a ∈ −P;

in other words, a is positive, or a is 0, or a is negative.

We observe that for b ∈ D,

b ∈ −P ⇔ −b ∈ P

or in words,

b is negative ⇔ −b is positive

since b ∈ −P ⇔ b = −a for some a ∈ P ⇔ −b = a for some a ∈ P ⇔
−b ∈ P. In addition, replacing b by −b we conclude that

−b ∈ −P ⇔ b ∈ P

or in words,

b is positive ⇔ −b is negative.

2-3-2. Examples. Consider the integral domains Z, Q, or R. In each case,
let P be the set of all elements which are greater than 0; thus, P is what we
are accustomed to calling the “set of all positive elements.” Clearly, the
requirements of the definition 2-3-1 are properties which we “know” for
integers, rational numbers, or real numbers. Consequently, Z, Q, and R are
ordered domains.

On the other hand, the domain Z_2 = {0, 1} cannot be made into an
ordered domain. There is no way to choose P so that the trichotomy
law holds: since 1 = −1 in Z_2, any P containing 1 would give 1 ∈ P ∩ −P,
while P cannot be empty and cannot contain 0.

Moreover, the integral domain C cannot be made into an ordered domain.
For suppose C is an ordered domain with set of positive elements P. Since
i = √−1 ≠ 0, the trichotomy law says that i ∈ P or i ∈ −P. If i ∈ P, then,
using condition (ii), −1 = i² ∈ P. If i ∈ −P, then −i ∈ P and, by condition (ii),
−1 = (−i)² ∈ P. Thus, in either case, −1 ∈ P, which implies 1 ∈ −P. On the
other hand, −1 ∈ P implies 1 = (−1)² ∈ P. We have, therefore, 1 ∈ P ∩ −P,
which contradicts the fact that the sets P and −P are disjoint.
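For a finite domain the search for a set P can be carried out exhaustively. The sketch below goes slightly beyond the text, which treats only Z_2: as an assumption of ours it also tries Z_3, Z_5 and Z_7, and the function name orderings is hypothetical. It tests every nonempty set of nonzero residues against conditions (i), (ii), (iii) of 2-3-1.

```python
from itertools import combinations


def orderings(p):
    """All subsets P of Z_p that satisfy conditions (i)-(iii) of 2-3-1."""
    nonzero = range(1, p)
    found = []
    # P must be nonempty and cannot contain 0, so only nonzero residues are tried
    for r in range(1, p):
        for P in map(frozenset, combinations(nonzero, r)):
            minus_P = frozenset((-a) % p for a in P)
            closed_add = all((a + b) % p in P for a in P for b in P)
            closed_mul = all((a * b) % p in P for a in P for b in P)
            # Z_p must be the disjoint union of P, {0} and -P
            trichotomy = not (P & minus_P) and (P | minus_P) == frozenset(nonzero)
            if closed_add and closed_mul and trichotomy:
                found.append(P)
    return found


no_ordering = {p: orderings(p) == [] for p in (2, 3, 5, 7)}
```

No such exhaustive search is available for C, which is why the argument with i is needed there.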

2-3-3. Definition. Suppose D is an ordered domain with P the set of
positive elements. We shall express this situation in more concise fashion by
saying that “{D, P} is an ordered domain.” For a, b ∈ D we say that a is less
than b, and write this symbolically as a < b or as b > a, when b − a ∈ P. In
such circumstances, following the usual terminology, we also say that a is
smaller than b, b is greater than a, or b is bigger than a. Thus, the symbol “<”
is to be read as “less than” and the symbol “>” is to be read as “greater
than.”

According to the definition

a < b ⇔ b − a ∈ P ⇔ b − a is positive.

So by taking a = 0 and changing b to a, or by taking b = 0, we have

a > 0 ⇔ a ∈ P ⇔ a is positive,
a < 0 ⇔ −a ∈ P ⇔ a ∈ −P ⇔ a is negative. (*)

Consequently, the notation < (or >) and terminology regarding positive and
negative elements conform to our expectations based on our experience with
real numbers. In other words, things turn out exactly the way they should!

2-3-4. Remark. The requirements for an ordered domain as given in the
definition 2-3-1 may be translated to the language and notation of “less than,
<,” or “greater than, >.” They become:

(i) a > 0, b > 0 ⇒ a + b > 0.

(ii) a > 0, b > 0 ⇒ ab > 0.

(iii) For any a ∈ D exactly one of the following holds:

a > 0, a = 0, a < 0.

Our next result provides additional properties of the relation < in an 
ordered domain. Even more important, we show how these properties of < 
can be used as an alternative set of axioms for making an integral domain into 
an ordered domain. 


2-3-5. Theorem. Suppose D is an integral domain. Then D is an ordered
domain ⇔ there exists a relation < on D with the properties:

(1) a < b, b < c ⇒ a < c; we say that the relation < is transitive.

(2) a < b ⇒ a + c < b + c for all c ∈ D; we say that the relation < is additive.

(3) a < b, c > 0 ⇒ ac < bc; we say that the relation < is multiplicative.

(4) For any a, b ∈ D exactly one of the following holds:

a < b, a = b, b < a;

we say that the relation < satisfies the trichotomy law.
(Obviously, this result may be restated in terms of a relation >.)
Proof: ⇒. To prove this part, we suppose {D, P} is an ordered domain
and define < (and >) as before, in 2-3-3; we must then prove properties
(1)-(4).

The proof of (1) consists of

a < b, b < c ⇒ b − a ∈ P, c − b ∈ P
⇒ c − a = (c − b) + (b − a) ∈ P
⇒ a < c.

In similar fashion, for any c ∈ D we have

a < b ⇒ b − a ∈ P
⇒ (b + c) − (a + c) = b − a ∈ P
⇒ a + c < b + c

which proves (2).

The proof of (3) consists of

a < b, c > 0 ⇒ b − a ∈ P, c ∈ P
⇒ bc − ac = (b − a)c ∈ P
⇒ ac < bc.

Finally, for a, b ∈ D exactly one of the following holds

b − a ∈ P, b − a = 0, b − a ∈ −P.

In virtue of the definition of <, and because b − a ∈ −P ⇔ a − b ∈ P, these
three possibilities translate to

a < b, a = b, b < a,

so (4) holds, and this part of the proof is complete.

⇐. To prove this part, we suppose D is an integral domain with relation
< satisfying properties (1)-(4); we must show that there is a subset P of D for
which {D, P} is an ordered domain. To do this, let us put

P = {a ∈ D | 0 < a}.

It remains to verify the three conditions of the definition 2-3-1 of ordered
domain for this P.

(i) If a ∈ P and b ∈ P, then 0 < a and 0 < b. Since 0 < a we have by
property (2), 0 + b < a + b. Combining b < a + b with 0 < b we have by
transitivity [property (1)] 0 < a + b, so a + b ∈ P. This proves (i).

(ii) If a ∈ P and b ∈ P, then 0 < a and 0 < b. Applying property (3), we
multiply through in 0 < a by the element b > 0, to obtain 0 = 0 · b < ab, so
ab ∈ P. This proves (ii).

(iii) Given a ∈ D, we apply property (4) for this a and b = 0. The result
is that exactly one of the following holds

a < 0, a = 0, 0 < a.

Now, adding −a to both sides of the case a < 0, we have according to (2),
0 = a + (−a) < 0 + (−a) = −a. Thus, the three distinct possibilities are

0 < −a, a = 0, 0 < a,

and these become

−a ∈ P, a = 0, a ∈ P.

Defining −P in the customary manner as {−b | b ∈ P} we have exactly one of
the following

a ∈ −P, a = 0, a ∈ P.

This proves (iii) and completes the proof. ∎

It is worthwhile for the reader to examine both parts of the preceding proof 
and see, in each case, where and how all the hypotheses are used. 
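The passage from P to the relation < can be imitated directly in code. In the sketch below, offered only as an illustration, the ordered domain is taken to be Q with its usual positive elements (an assumption of ours), the names is_positive and less are hypothetical, and properties (1) through (4) of 2-3-5 are tested on a finite sample rather than proved.

```python
from fractions import Fraction
from itertools import product


def is_positive(a):
    """Membership test for P, the positive rationals (the choice of Q is an assumption)."""
    return a > 0


def less(a, b):
    """a < b in the sense of 2-3-3: the difference b - a belongs to P."""
    return is_positive(b - a)


sample = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

# (1) transitivity
transitive = all(
    less(a, c) for a, b, c in product(sample, repeat=3) if less(a, b) and less(b, c)
)
# (2) additivity
additive = all(
    less(a + c, b + c) for a, b, c in product(sample, repeat=3) if less(a, b)
)
# (3) multiplicativity, with c > 0 expressed as less(0, c)
multiplicative = all(
    less(a * c, b * c)
    for a, b, c in product(sample, repeat=3)
    if less(a, b) and less(0, c)
)
# (4) trichotomy: exactly one of a < b, a = b, b < a
trichotomy = all(
    [less(a, b), a == b, less(b, a)].count(True) == 1 for a, b in product(sample, repeat=2)
)
```

Note that less never compares a with b directly; it only asks whether b − a lies in P, which is exactly the point of the definition.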

The fact that there are two distinct but equivalent ways to axiomatize an
ordered domain is not of crucial importance for us. We wish to derive additional
properties of ordered domains, over and above the properties of the set
P of positive elements and the relation < which are already in hand.


2-3-6. Proposition. In an ordered domain {D, P} we have:

(1) If a ≠ 0, then a² > 0; in words, the square of any nonzero element
is positive.

(2) e > 0 and −e < 0; in words, the identity e is positive, and its
additive inverse −e is negative.

(3) If c < 0, then a < b ⇔ cb < ca.

(4) a < b ⇔ −b < −a.

(5) If a < 0 and b > 0, then ab < 0; in words, the product of a positive
element and a negative element is negative.

Proof: (1) If a ≠ 0, then a ∈ P or a ∈ −P, but not both. If a ∈ P, then
a² = a · a ∈ P, so a² > 0. If a ∈ −P, then −a ∈ P and a² = (−a) · (−a) ∈ P,
so a² > 0. This takes care of the two possibilities and shows that the square of
a nonzero element is positive.

(2) According to the definition of integral domain e ≠ 0, so by the preceding
e = e² is greater than 0. Thus e ∈ P, which implies −e ∈ −P; this
means (see (*), for example) that −e < 0.

(3) ⇒. This part is straightforward; namely,

c < 0, a < b ⇒ −c ∈ P, b − a ∈ P
⇒ ca − cb = (−c)(b − a) ∈ P
⇒ cb < ca.

⇐. The proof of this part is less direct. We are given c < 0 and cb < ca.
Concerning a and b there are three distinct possibilities: a < b, a = b, b < a.
If a = b, then ca = cb, which contradicts the hypothesis cb < ca and so excludes
the possibility that a = b. If b < a, then,
using the first part of this proof, we have ca < cb. This contradicts the
hypothesis cb < ca and excludes the possibility that b < a. The only remaining
possibility is a < b.

(4) By part (2), −e < 0, and we simply apply (3) with c = −e.

(5) Applying part (3) to a < 0 and 0 < b gives ab < a · 0 = 0. ∎

Of course, many other “ natural ” properties of < hold. Some additional 
ones are listed in the problems. 

2-3-7. Remark. Consider an element a ≠ 0 in the ordered domain {D, P}.
If a ∈ P, then −a ∈ −P, and we have −a < 0 < a. On the other hand, if
a ∈ −P, then −a ∈ P, and we have a < 0 < −a. In either case, looking at a
and −a, we see that one of these elements is positive and the other is negative.
In particular, a and −a are distinct (that is, a ≠ −a), so the set {a, −a} has
two elements. We may now define |a|, the absolute value of a, for any a ∈ D, as
follows:

|a| = 0 if a = 0, and |a| = the positive member of the set {a, −a} if a ≠ 0. (#)

If a ≠ 0, then, according to the definition, |a| is a positive element of D;
it follows, therefore, that for a ∈ D,

|a| = 0 ⇔ a = 0.

The definition of absolute value may be rephrased as

|a| = the biggest element of {a, −a} for all a ∈ D.

In fact, if a ≠ 0, then |a| is the positive element of {a, −a}, and the positive
element is surely the bigger one. Even more, this version of the definition is
applicable for a = 0 also, since then −a = a = 0, and 0 may still be considered
as the biggest element of {a, −a}.

Instead of the word “biggest,” it is, in general, common to use the word
maximum, and consequently we write

|a| = max{a, −a} for all a ∈ D. (##)
For any a ∈ D (including the possibility a = 0) the set {a, −a} and the set
{−a, −(−a)} are identical (meaning that they have the same elements), so
in virtue of (##), we have

|−a| = |a| for all a ∈ D.

There is still another way to state the definition of absolute value. We write
a ≤ b or b ≥ a (and read these in the standard way) to mean a < b or a = b,
and observe that the basic properties of < carry over to ≤. Then clearly

|a| = a if a ≥ 0, and |a| = −a if a ≤ 0. (###)

The appearance of the situation a = 0 twice in this formulation causes no
difficulty. We then have a = −a = 0 and |a| = 0 in both cases, so everything is
consistent.

It is often convenient to write

|a| = ±a

(the right side to be read as “plus or minus a”). The meaning is simply that
|a| is a or −a; we do not know which one, and often do not really care. Of
course, this notation includes |0| = ±0.
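The three formulations (#), (##), (###) can be written out side by side and compared. The sketch below assumes, for illustration only, that the ordered domain is Q, represented by Python's Fraction type; the three function names are ours.

```python
from fractions import Fraction


def abs_positive_member(a):
    """Version (#): 0 if a = 0, otherwise the positive member of {a, -a}."""
    if a == 0:
        return a
    return a if a > 0 else -a


def abs_max(a):
    """Version (##): the maximum of a and -a."""
    return max(a, -a)


def abs_cases(a):
    """Version (###): a if a >= 0, and -a if a <= 0."""
    return a if a >= 0 else -a


sample = [Fraction(n, d) for n in range(-5, 6) for d in (1, 2, 7)]

# The three versions give the same value on every sample element
definitions_agree = all(
    abs_positive_member(a) == abs_max(a) == abs_cases(a) for a in sample
)
# |-a| = |a|, because {a, -a} and {-a, -(-a)} are the same set
symmetric = all(abs_max(-a) == abs_max(a) for a in sample)
```

In abs_cases the value a = 0 is handled by the first branch; the second branch would give the same answer, which is the consistency remarked on above.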

We turn to several of the basic properties of absolute value. 


2-3-8. Proposition. For any a, b in the ordered domain {D, P}, we have:

(1) −|a| ≤ a ≤ |a|.

(2) If b ≥ 0 and −b ≤ a ≤ b, then |a| ≤ b.

(3) |ab| = |a| |b|; in words, the absolute value of a product is the product
of the absolute values.

(4) |a + b| ≤ |a| + |b|; in words, the absolute value of a sum is less
than or equal to the sum of the absolute values. This is known as the
triangle inequality.


Proof: (1) For any a ∈ D we have clearly a ≤ max{a, −a} = |a|. Applying
this to the element −a gives −a ≤ |−a| = |a|. The latter relation may be
multiplied by −e [that is, we apply 2-3-6, part (4), which carries over to the
relation ≤] to obtain −|a| ≤ a. Consequently, −|a| ≤ a ≤ |a|.

(2) Given −b ≤ a ≤ b we multiply through by −e, as above, to obtain
−b ≤ −a ≤ −(−b) = b. Thus, both a and −a are less than or equal to b;
hence |a| = max{a, −a} ≤ b.

(3) We know that |a| = ±a and |b| = ±b, so according to the rules for
multiplication with + and − (see 2-1-17), it follows that |a| |b| = ±ab. We
do not know if +ab or −ab is the proper choice, but since |a| |b| ≥ 0 we do
know that the proper choice is the one which is not negative, namely
max{ab, −ab}. But this is precisely |ab|. Hence,

|a| · |b| = ±ab = max{ab, −ab} = |ab|.

(4) We start from −|a| ≤ a ≤ |a| and −|b| ≤ b ≤ |b|. It is easy to see that
adding yields

−(|a| + |b|) ≤ a + b ≤ (|a| + |b|)

[see, for example, Problem 4, part (xi) at the end of this section]. Since
|a| + |b| ≥ 0 we may apply part (2) above, and conclude that |a + b| ≤
|a| + |b|. ∎
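The four parts of 2-3-8 lend themselves to a quick numerical spot check. As before, the sketch below assumes the ordered domain Q (via Fraction) and a small sample of our own choosing; absval is a hypothetical helper implementing (##).

```python
from fractions import Fraction
from itertools import product


def absval(a):
    """|a| = max{a, -a}, as in (##)."""
    return max(a, -a)


sample = [Fraction(n, d) for n in range(-4, 5) for d in (1, 3)]

# (1) -|a| <= a <= |a|
prop1 = all(-absval(a) <= a <= absval(a) for a in sample)
# (2) b >= 0 and -b <= a <= b imply |a| <= b
prop2 = all(
    absval(a) <= b for a, b in product(sample, repeat=2) if b >= 0 and -b <= a <= b
)
# (3) |ab| = |a| |b|
prop3 = all(absval(a * b) == absval(a) * absval(b) for a, b in product(sample, repeat=2))
# (4) the triangle inequality
prop4 = all(absval(a + b) <= absval(a) + absval(b) for a, b in product(sample, repeat=2))

# The triangle inequality can be strict: |1 + (-1)| = 0 while |1| + |-1| = 2
strict_example = absval(Fraction(1) + Fraction(-1)) < absval(Fraction(1)) + absval(Fraction(-1))
```

The last line is a reminder that (4) is an inequality and not an equality: cancellation between a and b can make the left side smaller.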

This completes our discussion of what happens when, in analogy with Z, 
the notion of order is introduced in an integral domain. However, there is 
one more property of the integers which we shall transfer to integral domains 
—namely, the commonly accepted fact (which was used several times in 
Chapter I) that any nonempty collection of positive integers has a smallest 
element. 

2-3-9. Definition. In the ordered domain {D, P}, the set of positive elements
P is said to be well ordered when every nonempty subset of P has a
smallest element, and in this case (to keep the terminology brief) {D, P} is said
to be well ordered.

Of course, by a smallest element of the nonempty subset S of P we mean an
element a ∈ S such that a ≤ b for all b ∈ S. Note that if a smallest element of
S exists, then it is unique. Indeed, if a' ∈ S is also a smallest element of S, then
we have a ≤ a' and a' ≤ a. This implies a = a'; for if a ≠ a', then a < a' and
a' < a, so by transitivity a < a, a contradiction.

2-3-10. Examples. The most obvious example of a well-ordered domain is
{Z, P} where P is the customary set of all positive integers. This is no surprise;
after all, the requirements for a well-ordered domain were chosen precisely
because they are satisfied in Z. However, as a preview of coming attractions it
may be remarked that it is extremely hard to find additional examples of
well-ordered domains. Thus, the rational numbers Q are an ordered domain when
P is taken as the set of all positive rational numbers. However, {Q, P} is not
well ordered, since if we take for S the set P itself, then S has no smallest
element because, as is well known, there is no smallest positive rational
number.
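The failure of well ordering in Q is easy to make concrete: from any positive rational one can manufacture a smaller one. The sketch below uses halving for this purpose; the function name smaller_positive and the starting value 1 are our own choices.

```python
from fractions import Fraction


def smaller_positive(q):
    """Given a positive rational q, return a positive rational strictly below it."""
    if q <= 0:
        raise ValueError("q must be positive")
    return q / 2


candidate = Fraction(1)
chain = [candidate]
for _ in range(5):
    candidate = smaller_positive(candidate)
    chain.append(candidate)

# chain is 1, 1/2, 1/4, 1/8, 1/16, 1/32: strictly decreasing and always positive
strictly_decreasing = all(b < a for a, b in zip(chain, chain[1:]))
all_positive = all(q > 0 for q in chain)
```

Since the process never stops, no element of P can be the smallest. In Z the analogous step is unavailable, because halving a positive integer need not produce an integer.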

We turn now to an investigation of integer-like properties which hold in 
any well-ordered domain. 
2-3-11. Theorem. If {D, P} is a well-ordered domain with unity element e,
then there are no elements between 0 and e; in other words, e is the smallest
positive element.


Proof: According to 2-3-6, we have 0 < e. Then, as expected, by the
phrase “a is an element between 0 and e” we mean that 0 < a < e. (Note that
the meaning of “between” excludes the possibilities a = 0 or a = e.) Of
course, such an a is automatically an element of P.

Turning to the proof, let us suppose the theorem is false. So there exists
an element between 0 and e. Consequently, if we put

S = {a ∈ D | 0 < a < e}

(in words, S is the set of all elements between 0 and e), then S is a nonempty
subset of P. By well ordering, S has a smallest element; call it c. We have then
0 < c < e and c > 0, so because < is multiplicative (see 2-3-5) it follows that

c · 0 < c · c < c · e.

Simplifying (and using c < e) we have

0 < c² < c < e.

This tells us that c² is an element of S which is smaller than c, contradicting
the choice of c as the smallest element of S. Thus, our initial assumption that
the theorem is false leads to a contradiction. Hence, the theorem is true.

Clearly, e is the smallest element of P; for if a ∈ P is smaller than e, then
0 < a < e, and according to the foregoing this is impossible. ∎


2-3-12. Corollary. If {D, P} is a well-ordered domain, then, for any a ∈ D, 
there are no elements between a and a + e. 


Proof: Since e > 0 we know that a < a + e. Now, suppose b is an element 
of D which lies between a and a + e: 

a < b < a + e. 

Adding -a to each of these terms we obtain, according to 2-3-5, 

a - a < b - a < (a + e) - a 

so that 

0 < b - a < e. 

Thus, b - a is an element between 0 and e, which contradicts 2-3-11; so the 
hypothesized element b which lies between a and a + e cannot exist. | 
In virtue of this result, if a is any element of a well-ordered domain, it is 
appropriate to refer to a + e as the “next” element, and also to call a and 
a + e “consecutive” elements. What happens if one keeps passing from an 
element to the next element over and over? Do things turn out as in Z? The 
answer will appear shortly. 

2-3-13. Definition. Suppose {D, P} is a well-ordered domain. A subset 
S of D is said to be inductive when it satisfies the condition 

a ∈ S ⇒ a + e ∈ S. 

The definition of an inductive set requires that for any a ∈ S the next 
element a + e must also belong to S. In other words, we might say (carelessly, 
perhaps) that S is closed under the operation of adding the unity element e, 
or that S is closed under “passage to the next element.” 

As examples of inductive sets in the well-ordered domain {Z, P}, we may 
list S = {n ∈ Z | n > 0}, or S = {n ∈ Z | n > 5}, or S = {n ∈ Z | n > -3}. In 
fact, the reader may easily convince himself that a subset S of Z is inductive ⇔ 
S = Z or S is of form {n ∈ Z | n > m} for some fixed m ∈ Z. Unfortunately, 
because we have given no examples of well-ordered domains other than 
{Z, P} we are in no position to give additional examples of inductive sets. 

The use of the name “inductive” for the property under consideration is 
suggestive of something quite familiar for the integers—namely, mathematical 
induction. The justification for our choice of this name appears in the next 
result and its consequences. 


2-3-14. Theorem (Mathematical Induction). Suppose {D, P} is a well- 
ordered domain. If S is a subset of P which satisfies the conditions 

(i) e ∈ S, (ii) S is inductive, 

then S = P. In other words, an inductive set of positive elements which 
contains the identity e must be P itself. In particular, this theorem is 
applicable for the well-ordered domain Z. 


Proof: Suppose the theorem is false, so S ≠ P. Let us put 

T = P ~ S 

(by which is meant, as usual, that T consists of those elements of P which do 
not belong to S) so, because S ≠ P, T is a nonempty subset of P. Clearly, S and 
T are disjoint sets whose union is P. 

Because we are in a well-ordered domain, the nonempty subset T of P 
has a smallest element, call it t. Now, t ∉ S (since t ∈ T and the sets S and T 
are disjoint) and hence t ≠ e (because e ∈ S). Furthermore, since t ∈ P, e is the 
smallest element of P, and t ≠ e, it follows that t > e; and then t - e > 0. We 
observe that t - e ∉ T (because t is the smallest element of T, and t - e < t) 
and consequently, since t - e ∈ P, we have t - e ∈ S. But S is inductive, so 
t = (t - e) + e ∈ S, a contradiction. This completes the proof. | 

The preceding formulation of mathematical induction may seem somewhat 
strange, but we can derive the more familiar versions from it in short 
order. 


2-3-15. Corollary (Mathematical Induction). Suppose {D, P} is a well- 
ordered domain and for every a ∈ P we have a statement (that is, assertion 
or proposition) π(a) which is either true or false. If the following two 
conditions are satisfied: 

(i) π(e) is true; 

(ii) for any a ∈ P, π(a) is true implies that π(a + e) is true, 

then π(a) is true for all a ∈ P. In particular, this result applies for the 
well-ordered domain Z. 


Proof: Let S be the set of all positive elements a for which π(a) is true: 

S = {a ∈ P | π(a) is true}. 

Condition (i) guarantees that e ∈ S. Using condition (ii) we have 

a ∈ S ⇒ π(a) is true ⇒ π(a + e) is true ⇒ a + e ∈ S, 

so that S is inductive. Hence, according to 2-3-14, S = P, and π(a) is true for 
all a ∈ P. | 


2-3-16. Corollary (Mathematical Induction). Suppose {D, P} is a well- 
ordered domain and for every a ∈ P we have a statement π(a) which is either 
true or false. If the following condition is satisfied: 

(i) For any a ∈ P, the truth of π(b) for all b ∈ P satisfying b < a implies 
that π(a) is true, 

then π(a) is true for all a ∈ P. In particular, this result applies for the well- 
ordered domain Z. 
Proof: Suppose the corollary is false; so there exists an a ∈ P for which 
π(a) is false. Therefore, the set 

T = {a ∈ P | π(a) is false} 

is a nonempty subset of P, and by well ordering T has a smallest element; call 
it c. The choice of c implies that π(b) is true for all b ∈ P with b < c. But now 
condition (i) tells us that π(c) is true, a contradiction. This completes the 
proof. | 

2-3-17. Remark. The three preceding results are of interest to us primarily 
in the case of the well-ordered domain Z. For this reason we restate these 
versions of mathematical induction explicitly for integers in the more familiar 
forms. 

(1) Suppose S is a set of positive integers such that 

(i) 1 ∈ S, (ii) n ∈ S ⇒ n + 1 ∈ S; 

then S is the set of all positive integers. 

(2) Suppose that for each positive integer n we have a statement π(n) 
which is either true or false. If 

(i) π(1) is true, 

(ii) for any n > 0, π(n) is true ⇒ π(n + 1) is true, 

then π(n) is true for all n > 0. 

(3) Suppose that for each positive integer n we have a statement π(n) that 
is either true or false. Suppose 

(i) for any n > 0, π(m) is true for all m satisfying 0 < m < n implies 
that π(n) is true; 

then π(n) is true for all n > 0. 


A word of caution is necessary in connection with the use of (3) to prove 
something by mathematical induction. On the face of it, the use of (3) appears 
simpler than the use of (2), because (3) involves the verification of only one 
condition whereas (2) involves two conditions. However, the appearance is 
deceptive. More precisely, consider the verification of condition (i) of (3) for 
the case n = 1. The set of all m satisfying 0 < m < 1 is empty, so the hypothesis 
part of condition (i), which says that π(m) is true for all m satisfying 0 < m < 1, 
is vacuously true. Hence, in order to verify condition (i) in the case of n = 1, 
we must show that this vacuously true hypothesis implies the truth of π(1). 
Thus, a direct proof (without any inductive assumptions) of the truth of π(1) 
must be given. This means that things turn out as in (2) where a direct proof 
of the truth of π(1) is required. 





2-3-18. Example. To illustrate a proof by induction, let us prove the 
formula, in Z, 

$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$ for all n ≥ 1. (*) 

This is, of course, an infinite number of formulas, one for each positive integer 
n. As is well known, the left-hand side of (*) is just a shorthand notation 
for the sum of the squares of the first n positive integers, so we are concerned 
with proving 

$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}$ for all n ≥ 1. 

We are not concerned here with how the expression on the right-hand side, 
n(n + 1)(2n + 1)/6, was derived; our sole interest is in verifying that the 
formula “works” for all n ≥ 1. 

Naturally, it is physically impossible to verify the formula for each n 
individually; instead we use the version of mathematical induction given in 
(1) of 2-3-17. Let 

S = {n > 0 | (*) is true for n}. 

In other words, S is the set of all positive integers n for which the formula (*) 
holds. 

To prove that 1 ∈ S, we need to check the validity of the formula (*) when 
n = 1. But this is trivial, since for n = 1 

$1^2 + 2^2 + \cdots + n^2$ equals 1 and $\frac{n(n+1)(2n+1)}{6}$ equals $\frac{1 \cdot 2 \cdot 3}{6} = 1$. 

It remains to prove that S is inductive or, what is the same, n ∈ S ⇒ 
n + 1 ∈ S. Thus, we assume (inductively) that (*) is true for a fixed n, so for 
this n we know 

$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$, 

and we seek to prove the validity of (*) for n + 1. Of course, the right side of 
(*) for n + 1 is obtained by replacing n by n + 1 in the right side of the 
expression (*), so we must prove 

$\sum_{i=1}^{n+1} i^2 = \frac{(n+1)(n+2)(2n+3)}{6}$. 


But this is straightforward; in fact, overdoing the details, we have 
$\sum_{i=1}^{n+1} i^2 = 1^2 + 2^2 + 3^2 + \cdots + (n+1)^2$ 

$= (1^2 + 2^2 + \cdots + n^2) + (n+1)^2$ 

$= \left(\sum_{i=1}^{n} i^2\right) + (n+1)^2$ 

$= \frac{n(n+1)(2n+1)}{6} + (n+1)^2$ (since n ∈ S) 

$= \frac{(n+1)\,[\,n(2n+1) + 6(n+1)\,]}{6}$ 

$= \frac{(n+1)(n+2)(2n+3)}{6}$. 

Now, according to version (1) of mathematical induction, S is the set of all 
positive integers, so (*) is true for all n > 0. 

It is also possible to prove the preceding result by using version (2) of 
mathematical induction. The details of the proof are very much like the 
foregoing, so we content ourselves with a sketch. For each n > 0, let π(n) be the 
statement: 

the formula (*) holds for the n under consideration. 

As before, π(1) is clearly true. Next, suppose inductively that π(n) is true, so 

$\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$ 

for the n in question. Using this one evaluates $\sum_{i=1}^{n+1} i^2$, and just as 
before this becomes (n + 1)(n + 2)(2n + 3)/6; so it follows that π(n + 1) is true. 
Consequently, according to version (2) of mathematical induction, π(n) is true for 
all n > 0. This completes the proof. | 
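A numerical spot check is no substitute for the induction, but it can catch a mistaken formula early. The following is a minimal sketch in Python; the function names are hypothetical. It compares the direct sum with the closed form and also checks the algebraic identity that drives the inductive step.

```python
def sum_of_squares(n: int) -> int:
    """Left-hand side of (*): 1^2 + 2^2 + ... + n^2, computed directly."""
    return sum(i * i for i in range(1, n + 1))

def closed_form(n: int) -> int:
    """Right-hand side of (*): n(n + 1)(2n + 1)/6 (the division is exact)."""
    return n * (n + 1) * (2 * n + 1) // 6

def inductive_step_holds(n: int) -> bool:
    """The step n -> n + 1: adding (n + 1)^2 to the closed form for n
    must give the closed form for n + 1."""
    return closed_form(n) + (n + 1) ** 2 == closed_form(n + 1)

base_case = closed_form(1) == 1
checked = all(sum_of_squares(n) == closed_form(n) for n in range(1, 200))
steps = all(inductive_step_holds(n) for n in range(1, 200))
```

The check covers only finitely many n; it is the induction that extends the formula to every positive integer.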


2-3-19. Examples. Consider the infinite sequence of integers $u_1, u_2, u_3, \ldots, 
u_n, \ldots$ defined as follows 

$u_1 = 1,\ u_2 = 1,\ u_3 = 2,\ u_4 = 3,\ u_5 = 5,\ u_6 = 8, \ldots$ 

and such that for n ≥ 2, $u_{n+1}$ is defined recursively by the rule 

$u_{n+1} = u_n + u_{n-1}$. 

In other words, each term, starting with the third, is the sum of the two 
preceding terms; so the sequence looks like 

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, .... 

It is sometimes convenient to put $u_0 = 0$. This does not harm the recursion, 
since clearly $u_2 = u_1 + u_0$. 
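The recursive rule translates directly into a loop. The following is a minimal sketch in Python; the function name `fibonacci` is hypothetical. It starts from $u_0 = 0$ and $u_1 = 1$ and appends each new term as the sum of the two preceding ones.

```python
def fibonacci(count: int) -> list[int]:
    """Return [u_0, u_1, ..., u_count], using u_0 = 0, u_1 = 1
    and the rule u_{n+1} = u_n + u_{n-1}."""
    terms = [0, 1]
    while len(terms) <= count:
        terms.append(terms[-1] + terms[-2])
    return terms[: count + 1]

first_terms = fibonacci(15)[1:]  # u_1 through u_15
```

Dropping the leading $u_0$ reproduces the fifteen terms listed above.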
This sequence is commonly known as the Fibonacci sequence. It has a 
number of interesting properties which may be proved by induction. We first 
list several such properties, and then prove (1), (2), (4), (9), and (10). 

(1) $(u_n, u_{n+1}) = 1$, for all n ≥ 1, 

(2) $u_{n+1} < (7/4)^n$, for all n ≥ 1, 

(3) $\sum_{i=1}^{n} u_i = u_{n+2} - 1$, for all n ≥ 1, 

(4) If we put $a = \frac{1 + \sqrt{5}}{2}$ and $b = \frac{1 - \sqrt{5}}{2}$, then 
$u_n = \frac{a^n - b^n}{\sqrt{5}}$, for all n ≥ 1, 

(5) $\sum_{i=1}^{n} u_{2i-1} = u_{2n}$, for all n ≥ 1, 

(6) $\sum_{i=1}^{n} u_{2i} = u_{2n+1} - 1$, for all n ≥ 1, 

(7) [formula illegible in source], for all n ≥ 1, 

(8) $\sum_{i=1}^{n} u_i^2 = u_n u_{n+1}$, for all n ≥ 1, 

(9) $\sum_{i=1}^{2n-1} u_i u_{i+1} = u_{2n}^2$, for all n ≥ 1, 

(10) $u_{m+n+1} = u_m u_n + u_{m+1} u_{n+1}$, for all m ≥ 0, n ≥ 0, 

(11) [formula illegible in source], for all n ≥ 1, 

(12) [formula illegible in source], for all k ≥ 1, n ≥ 1, 

(13) [formula illegible in source], for all n ≥ 1 [a as in (4)]. 

Proof of (1): For each n ≥ 1, let π(n) be the statement 

$(u_n, u_{n+1})$ is equal to 1. 

In other words, π(n) is the assertion that $u_n$ and $u_{n+1}$ are relatively prime. In 
the first place, π(1) is true because $(u_1, u_2) = (1, 1) = 1$. Now, suppose 
inductively that π(n) is true for some given n, so $(u_n, u_{n+1}) = 1$. We must 
prove the truth of π(n + 1); that is, we must show that $(u_{n+1}, u_{n+2}) = 1$. To do 
this, suppose d is any positive divisor of $(u_{n+1}, u_{n+2})$; so d divides both $u_{n+1}$ 
and $u_{n+2} = u_{n+1} + u_n$. It follows that $d \mid u_n$, and consequently d divides 
$(u_n, u_{n+1}) = 1$. Therefore, d = 1, and then $(u_{n+1}, u_{n+2}) = 1$, so π(n + 1) is 
indeed true. Hence, by mathematical induction, π(n) is true for all n, which 
completes the proof that any two consecutive terms of the Fibonacci sequence 
are relatively prime. | 
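Property (1) is easy to test numerically. The following is a minimal sketch in Python, using `math.gcd` from the standard library; the helper `fib` is a hypothetical name. It also checks the fact on which the inductive step rests: a common divisor of $u_{n+1}$ and $u_{n+2}$ divides $u_n$, so the two consecutive greatest common divisors coincide.

```python
from math import gcd

def fib(n: int) -> int:
    """Return u_n, with u_0 = 0 and u_1 = 1."""
    a, b = 0, 1  # (u_0, u_1)
    for _ in range(n):
        a, b = b, a + b
    return a

# Property (1): consecutive terms are relatively prime.
consecutive_gcds = [gcd(fib(n), fib(n + 1)) for n in range(1, 60)]

# Inductive step: (u_{n+1}, u_{n+2}) = (u_n, u_{n+1}).
step_ok = all(
    gcd(fib(n + 1), fib(n + 2)) == gcd(fib(n), fib(n + 1))
    for n in range(1, 60)
)
```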
Proof of (2): For each n ≥ 1, let π(n) be the statement 

$u_{n+1}$ is less than $(7/4)^n$. 

For n = 1 we have $u_2 = 1$ and $(7/4)^1 = 7/4$, so π(1) is true. For n = 2 we have 
$u_3 = 2$ and $(7/4)^2 = 49/16$, so π(2) is true. Now, let us apply version (3) of 
mathematical induction (see 2-3-17). We must show 

for any n > 0, 
π(m) is true for all m satisfying 0 < m < n implies π(n) is true. (#) 

This implication is surely valid when n = 1; for in this case, we have already 
shown directly that π(1) is true, so it is of no consequence that there are no 
integers m for which 0 < m < n = 1. Furthermore, the implication (#) is 
valid when n = 2. In this case, we have already shown directly that π(2) is 
true, so it does not really matter if π(m) is true for all m satisfying 0 < m < n = 2; 
in other words, it does not really matter if π(1) is true, and the implication 
(#) would be valid even if π(1) were false. (The argument used here is a fact of 
logic, and the reader should make every effort to understand it thoroughly.) 
Finally, let us show the validity of the implication (#) when n ≥ 3. In this 
situation, we have by the hypothesis of (#), $u_{m+1} < (7/4)^m$ for m = 1, 2, ..., 
n - 1, and then 

$u_{n+1} = u_n + u_{n-1} < (7/4)^{n-1} + (7/4)^{n-2} = (7/4)^{n-2}\left(\tfrac{7}{4} + 1\right) < (7/4)^{n-2}(7/4)^2 = (7/4)^n$ 

so π(n) is true. Thus, the implication (#) is valid for all n > 0, so π(n) is true 
for all n > 0 and the proof is complete. | 
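The whole argument hinges on the single numerical fact 7/4 + 1 < (7/4)², that is, 44/16 < 49/16. The following is a minimal sketch in Python that checks this fact and the bound itself in exact rational arithmetic; the helper names `fib` and `bound_holds` are hypothetical.

```python
from fractions import Fraction

def fib(n: int) -> int:
    """Return u_n, with u_0 = 0 and u_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

RATIO = Fraction(7, 4)

# The inequality that makes the inductive step work.
key_inequality = RATIO + 1 < RATIO ** 2

def bound_holds(n: int) -> bool:
    """Statement pi(n): u_{n+1} < (7/4)^n, tested without rounding."""
    return fib(n + 1) < RATIO ** n

all_hold = all(bound_holds(n) for n in range(1, 60))
```

Using `Fraction` rather than floating point keeps every comparison exact, so a passing check cannot be an artifact of rounding.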

In connection with this proof, it should be noted that a proof based on 
version (2) of mathematical induction (see 2-3-17) cannot be given. Because a 
term of the Fibonacci sequence depends on the two terms immediately 
preceding it, an induction which tries to pass from information about a single 
term of the sequence to the same kind of information about the next term 
cannot work. In particular, it is for this reason that we proved the truth of 
π(n) for both n = 1 and n = 2 at the start; then an induction based on version (3) 
of mathematical induction could proceed. 


Proof of (4): Here again we use version (3) of mathematical induction. 
A brief sketch will suffice. One verifies in a straightforward manner that the 
relation 

$u_n = \frac{a^n - b^n}{\sqrt{5}}$ 

is true for n = 1 and n = 2. Then under the inductive assumption that 

$u_m = \frac{a^m - b^m}{\sqrt{5}}$ for m = 1, 2, 3, ..., n - 1 (where n ≥ 3) 

one obtains (using $a^2 = a + 1$ and $b^2 = b + 1$) 

$u_n = u_{n-1} + u_{n-2} = \frac{a^{n-1} - b^{n-1}}{\sqrt{5}} + \frac{a^{n-2} - b^{n-2}}{\sqrt{5}}$ 

$= \frac{a^{n-2}(a + 1) - b^{n-2}(b + 1)}{\sqrt{5}}$ 

$= \frac{(a^{n-2})(a^2) - (b^{n-2})(b^2)}{\sqrt{5}}$ 

$= \frac{a^n - b^n}{\sqrt{5}}$. 

Consequently, the desired relation is true for all n ≥ 1. | 
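Formula (4) can be checked without any floating-point error by computing with numbers of the form x + y√5, where x and y are rational. The following is a minimal sketch in Python under that assumption: a pair `(x, y)` stands for x + y√5, and the names `mul`, `power`, and `binet` are hypothetical.

```python
from fractions import Fraction

def mul(p, q):
    """(x1 + y1*sqrt5)(x2 + y2*sqrt5) = (x1*x2 + 5*y1*y2) + (x1*y2 + y1*x2)*sqrt5."""
    x1, y1 = p
    x2, y2 = q
    return (x1 * x2 + 5 * y1 * y2, x1 * y2 + y1 * x2)

def power(p, n):
    """p multiplied by itself n times (n >= 1)."""
    result = p
    for _ in range(n - 1):
        result = mul(result, p)
    return result

A = (Fraction(1, 2), Fraction(1, 2))    # a = (1 + sqrt5)/2
B = (Fraction(1, 2), Fraction(-1, 2))   # b = (1 - sqrt5)/2

def binet(n):
    """(a^n - b^n)/sqrt5, computed exactly."""
    ax, ay = power(A, n)
    bx, by = power(B, n)
    assert ax - bx == 0   # the rational parts cancel
    return ay - by        # ((ay - by)*sqrt5) / sqrt5

def fib(n: int) -> int:
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# The relation a^2 = a + 1 used in the proof.
a_squared_ok = power(A, 2) == (A[0] + 1, A[1])
```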


Proof of (9): We give a compact, but complete, proof. Let 

$S = \left\{ n \geq 1 \;\middle|\; \sum_{i=1}^{2n-1} u_i u_{i+1} = u_{2n}^2 \right\}$. 

First of all, 1 ∈ S, because in this case $\sum_{i=1}^{2n-1} u_i u_{i+1}$ is $u_1 u_2$ while $u_{2n}^2$ is $u_2^2$, 
and $u_1 u_2 = 1 = u_2^2$. Then, suppose inductively that n ∈ S, so $\sum_{i=1}^{2n-1} u_i u_{i+1} = 
u_{2n}^2$; we seek to show that n + 1 ∈ S. Since 2(n + 1) - 1 = 2n + 1 this means 
we want 

$\sum_{i=1}^{2n+1} u_i u_{i+1} = u_{2(n+1)}^2$. 

To do this, we compute 

$\sum_{i=1}^{2n+1} u_i u_{i+1} = \sum_{i=1}^{2n-1} u_i u_{i+1} + u_{2n} u_{2n+1} + u_{2n+1} u_{2n+2}$ 

$= u_{2n}^2 + u_{2n} u_{2n+1} + u_{2n+1} u_{2n+2}$ 

$= u_{2n}(u_{2n} + u_{2n+1}) + u_{2n+1} u_{2n+2}$ 

$= u_{2n} u_{2n+2} + u_{2n+1} u_{2n+2}$ 

$= (u_{2n} + u_{2n+1}) u_{2n+2}$ 

$= u_{2n+2}^2$. 

Thus, S is inductive and it follows that S consists of all positive integers; so 
the given formula holds for all n ≥ 1. | 
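As a sanity check on (9), the following minimal Python sketch evaluates both sides for a range of n. The helper names `fib`, `lhs`, and `rhs` are hypothetical.

```python
def fib(n: int) -> int:
    """Return u_n, with u_0 = 0 and u_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def lhs(n: int) -> int:
    """Sum of u_i * u_{i+1} for i = 1, ..., 2n - 1."""
    return sum(fib(i) * fib(i + 1) for i in range(1, 2 * n))

def rhs(n: int) -> int:
    """The square of u_{2n}."""
    return fib(2 * n) ** 2

agree = all(lhs(n) == rhs(n) for n in range(1, 40))
```

For n = 2, for instance, the left side is 1·1 + 1·2 + 2·3 = 9, which is the square of $u_4 = 3$.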

Proof of (10): Our objective is to prove 

$u_{m+n+1} = u_m u_n + u_{m+1} u_{n+1}$ (*) 

for all m ≥ 0, n ≥ 0. Since both m and n vary there is a “double infinity” of 
choices for the pair of indices m and n, so some kind of “double induction” is 
required. 

With each pair of integers m ≥ 0, n ≥ 0 let us associate the point (m, n) in 
the X-Y plane. Plotting all such integral points, we have the accompanying 
picture. 

[Figure: the integral points (m, n), m ≥ 0, n ≥ 0, plotted in the X-Y plane.] 

For each n ≥ 0, let π(n) be the statement 

the relation (*) holds for this n and all m ≥ 0. 

In other words, π(n) asserts that (*) holds for all integral points in row n of the 
picture (where the rows are naturally numbered, from the bottom upwards, 
by 0, 1, 2, 3, ...). The idea of our proof, then, is to apply induction for the 
rows of the picture. 

First of all, π(0) is true. To see this, one notes that for n = 0 and all m ≥ 0, 
the relation (*) reads 

$u_{m+1} = u_m u_0 + u_{m+1} u_1$ 

and this is true since $u_0 = 0$, $u_1 = 1$. 

Furthermore, π(1) is true. In fact, for n = 1 and all m ≥ 0, the relation (*) 
takes the form 

$u_{m+2} = u_m u_1 + u_{m+1} u_2$ 

and this is true since $u_1 = 1$, $u_2 = 1$. 

Finally, suppose inductively that π(k) is true for k = 0, 1, 2, ..., n (where 
n ≥ 1); we must show that π(n + 1) is true. In other words, assuming (*) is true 
for rows 0, 1, 2, ..., n, we have to show it is also true for row n + 1. Now, we 
have 

$u_{m+(n+1)+1} = u_{m+(n+1)} + u_{m+(n+1)-1}$ 

$= u_{m+n+1} + u_{m+(n-1)+1}$ 

$= (u_m u_n + u_{m+1} u_{n+1}) + (u_m u_{n-1} + u_{m+1} u_n)$ 

$= u_m (u_n + u_{n-1}) + u_{m+1}(u_{n+1} + u_n)$ 

$= u_m u_{n+1} + u_{m+1} u_{n+2}$ 

so indeed π(n + 1) is true. Thus, π(n) is true for all n, and the proof is 
complete. | 

This proof, it should be noted, uses a slight (but obviously permissible) 
modification of version (3) of mathematical induction; namely, instead of 
starting from n = 1, we start from n = 0. 
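The “double infinity” of cases can at least be sampled over a finite grid of points (m, n). The following is a minimal sketch in Python; the names `fib` and `identity_holds` are hypothetical.

```python
def fib(n: int) -> int:
    """Return u_n, with u_0 = 0 and u_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def identity_holds(m: int, n: int) -> bool:
    """Relation (*): u_{m+n+1} = u_m u_n + u_{m+1} u_{n+1}."""
    return fib(m + n + 1) == fib(m) * fib(n) + fib(m + 1) * fib(n + 1)

# Rows n = 0, 1, ..., 29 of the picture, each sampled for m = 0, 1, ..., 29.
rows_ok = [all(identity_holds(m, n) for m in range(30)) for n in range(30)]
```

Each entry of `rows_ok` corresponds to one row of integral points, mirroring the way the proof applies induction to the rows.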


2-3-20 / PROBLEMS 

1. If {D, P} is an ordered domain, show that 

(i) a < b ⇔ a + c < b + c. 

(ii) a - c < a - d ⇔ d < c. 

(iii) a < b ⇒ a ≠ b. 

(iv) If c > 0, then ac < bc ⇔ a < b. 

(v) If a < 0 and b < 0, then ab > 0. 

2. In an ordered domain {D, P} show that 

(i) If a < b and c < d, then a + c < b + d. 

(ii) If 0 < a < b and 0 < c < d, then ac < bd. 

3. Suppose {D, P} is an ordered domain. 

(i) Prove that if a < b, then $a^3 < b^3$. 

(ii) What about the converse? 

(iii) Generalize (i) and (ii) for an arbitrary odd power. How about even 
powers? 
4. Prove the following properties of < in an ordered domain {D, P}: 

(i) a > 0, b > 0 ⇒ a + b > 0. 

(ii) a > 0, b > 0 ⇒ ab > 0. 

(iii) a < b, b < c ⇒ a < c. 

(iv) a < b ⇔ a + c < b + c. 

(v) If c > 0, then a < b ⇒ ac < bc. 

(vi) If c < 0, then a < b ⇒ cb < ca. 

(vii) a < b ⇔ -b < -a. 

(viii) a < 0, b < 0 ⇒ ab > 0. 

(ix) a < 0, b > 0 ⇒ ab < 0. 

(x) $a^2$ > 0 for all a ∈ D. 

(xi) a < b, c < d ⇒ a + c < b + d. 

(xii) 0 < a < b, 0 < c < d ⇒ 0 < ac < bd. 

5. In the ordered domain {D, P}, prove: 

(i) a > 0, b > 0 ⇒ a + b > 0. 

(ii) a > 0, b > 0 ⇒ ab > 0. 

(iii) a < [remainder illegible in source] 

(iv) If c > 0, then a < b ⇔ ac < bc. 

(v) If c > 0, then a < b ⇒ ac < bc. 

(vi) If c < 0, then a < b ⇔ bc < ac. 

(vii) If c < 0, then a < b ⇒ bc < ac. 

(viii) a < 0, b < 0 ⇒ ab > 0. 

(ix) a < 0, b > 0 ⇒ ab < 0. 

(x) a < 0, b > 0 ⇒ ab < 0. 

(xi) a < b, c < d ⇒ a + c < b + d. 

(xii) 0 < a < b, 0 < c < d ⇒ 0 < ac < bd. 

(xiii) 0 < a < b, 0 < c < d ⇒ 0 < ac < bd. 

6. Show that the equation $x^2 + e = 0$ cannot be solved in an ordered domain 
{D, P}. 

7. For a, b in an ordered domain show that 

$a^2 + b^2 \geq 2ab$ 

(where 2ab means ab + ab). When does equality hold? 

8. If a ≤ b and b ≤ a in an ordered domain {D, P} show that a = b. What 
happens if b < a? 

9. For any a, b in the ordered domain {D , P}, prove: 
10. For any a in the ordered domain {D, P}, is it true that 

-|a| = min{a, -a} 

where, of course, “min” means the minimum (that is, the smaller) of the 
two elements? 

11. For elements a, b, c in an ordered domain {D, P}, 

(i) min{a, b} + max{a, b} = a + b, 

(ii) min{max{a, b}, c} = max{min{a, c}, min{b, c}}, 

(iii) max{min{a, b}, c} = min{max{a, c}, max{b, c}}, 

(iv) max{a, b, c} + min{a + b, a + c, b + c} = a + b + c, 

(v) min{max{a, b}, max{a, c}, max{b, c}} 

= max{min{a, b}, min{a, c}, min{b, c}}. 

12. Can you make D = Z[√3] into an ordered domain? How about 
D = Z[i]? 

13. Prove that 

$a^2 - ab + b^2 \geq 0$ 

for all a, b in the ordered domain {D, P}. 


14. Use mathematical induction to prove each of the following formulas for 
all integers n greater than or equal to 1. 

(i) $\sum_{i=1}^{n} i = 1 + 2 + \cdots + n = \frac{n(n+1)}{2}$, 

(ii) $\sum_{i=1}^{n} \frac{i(i+1)}{2} = 1 + 3 + 6 + \cdots + \frac{n(n+1)}{2} = \frac{n(n+1)(n+2)}{6}$, 

(iii) $\sum_{i=1}^{n} \frac{1}{i(i+1)} = \frac{1}{2} + \frac{1}{6} + \frac{1}{12} + \cdots + \frac{1}{n(n+1)} = \frac{n}{n+1}$, 

(iv) $\sum_{i=1}^{n} (2i-1)^2 = 1^2 + 3^2 + 5^2 + \cdots + (2n-1)^2 = \frac{4n^3 - n}{3}$, 

(v) $\sum_{i=1}^{n} i^3 = 1^3 + 2^3 + 3^3 + \cdots + n^3 = \left(\frac{n(n+1)}{2}\right)^2$. 

15. Find formulas for each of the following sums, and then prove their 
validity by induction. 

(i) $\sum_{i=1}^{n} 2i = 2 + 4 + 6 + \cdots + (2n)$, n ≥ 1. 

(ii) $\sum_{i=1}^{n} (2i - 1) = 1 + 3 + 5 + \cdots + (2n - 1)$, n ≥ 1. 
16. For a, d, r ≠ 1 in R, what is the sum of the first n terms of 

(i) the arithmetic progression: a, a + d, a + 2d, ..., a + (n - 1)d, ..., 

(ii) the geometric progression: $a, ar, ar^2, \ldots, ar^{n-1}, \ldots$ 

Prove your assertions by induction. 

17. List several places in Chapter I where some form of mathematical 
induction or well ordering is used, explicitly or implicitly, in a proof. 

18. Prove the parts of 2-3-19 which were not proved in the text. 

19. Use induction to prove that 7 divides $5^{2n+1} + 2^{2n+1}$ for all n ≥ 0. 

20. For any n ≥ 1 prove inductively that a set with n elements has exactly $2^n$ 
subsets (among which are included the empty set and the set itself). 

21. Suppose { D, P } is a well-ordered domain. 

(i) We know that P has a smallest element e; does P have a biggest 
element? 

(ii) Does every subset of D have a smallest element? 

(iii) Is there a subset of D which is not inductive? 

Justify your answers. 

22. In a well-ordered domain {D, P}, we have: 

(i) a ≥ b ⇔ a > b - e. 

(ii) If S is an inductive subset of D which has a smallest element s, then 

S = {a ∈ D | a ≥ s} = {a ∈ D | a > s - e}. 

In virtue of this result, we are justified in starting a proof by mathematical 
induction from any integer n, when the situation calls for it, rather than 
solely from n = 1 (as was stated in 2-3-17). 

23. Induction is used, when feasible, to prove an infinite number of statements. 
What is the idea underlying the inductive approach? 

24. Suppose we have three pegs on one of which there are piled n circular 
disks, where each disk is smaller than the one on which it lies. 
Disks may be moved from one peg to another under the following 
restrictions: 

(i) Only one disk may be moved at a time. 

(ii) A disk may not be placed on top of a smaller disk. 

Prove that under these rules, it is possible to transfer the entire pile of 
disks to another peg; moreover, this transfer can be accomplished in 
$2^n - 1$ moves. 

(iii) If the peg to which all the disks are to be moved is specified in 
advance, what should the first move be? 
2-4. Computation Rules 

The informal phrase “and so on” occurs often in mathematical arguments 
to signify that a proof may be completed by repeating, as many times as 
required, the steps of the proof which have already been described. In fact, 
“and so on” is usually a cover-up for some kind of mathematical induction. 
In this section we shall illustrate such uses of induction by showing, with a 
certain amount of care, how the basic rules for computation which we normally 
take for granted have meaning and validity in any ring. 


2-4-1. Generalized Associative Law for Multiplication. Suppose n ≥ 3, and 
let $a_1, a_2, \ldots, a_n$ be arbitrary elements of the ring R; then the meaning of 

$a_1 a_2 a_3 \cdots a_n$ 

is unambiguous. In other words, all ways of inserting parentheses in this 
expression, and then carrying out the required multiplications, give the 
same result. 


Proof: The meaning of this result is as follows. According to the definition 
of a ring, we know how to multiply two elements of R, but the definition says 
nothing about multiplying more than two elements. In particular, the symbol 
$a_1 a_2 a_3 \cdots a_n$ has no meaning unless we insert parentheses which prescribe the 
order in which one should perform a sequence of multiplications (where it is 
understood that at each step one multiplies exactly two elements of R) until 
all the elements $a_1, a_2, \ldots, a_n$ are used up. The assertion here is that no matter 
how the parentheses are placed the end result is always the same. For example, 
when n = 6, this result implies the equality of 

$((((a_1 a_2)a_3)a_4)a_5)a_6$ and $(a_1 a_2)((a_3 a_4)(a_5 a_6))$. 

Turning to the proof, we consider first the case n = 3. There are exactly 
two possible ways to insert parentheses, namely $(a_1 a_2)a_3$ or $a_1(a_2 a_3)$, and 
according to the associative law both of these determine the same element of 
R. As was noted long ago [see 2-1-2, part (5)], this common value is denoted by 
$a_1 a_2 a_3$, and there is no ambiguity. 

To deal with the general case, it is convenient to assign, at the start, a 
specific meaning to the symbol $a_1 a_2 a_3 \cdots a_n$. More precisely, let us take 

$a_1 a_2 a_3 \cdots a_n = ((\cdots(((a_1 a_2)a_3)a_4)\cdots)a_{n-1})a_n$. 

In other words, we start with $a_1 a_2$, then compute $(a_1 a_2)a_3$, and keep throwing 
in the remaining $a_i$'s on the right, one at a time. Of course, this definition, as 
indicated by the use of three dots, involves an induction or recursion. In more 
detail, the definition of $a_1 a_2 a_3 \cdots a_n$ for n ≥ 3 is given by 

$a_1 a_2 a_3 = (a_1 a_2)a_3$, 
$a_1 a_2 a_3 a_4 = (a_1 a_2 a_3)a_4$, 

and then, inductively 

$a_1 a_2 a_3 \cdots a_{n-1} a_n = (a_1 a_2 a_3 \cdots a_{n-1})a_n$. (#) 

We must show that any way of inserting parentheses gives this value. 

Thus having observed the validity of the generalized associative law for 
n = 3, we consider n ≥ 4 and suppose inductively that the generalized 
associative law holds for all m < n (that is, for m = 3, 4, ..., n - 1). Consider an 
arbitrary, but fixed, choice of parentheses for the multiplication of $a_1, a_2, \ldots, 
a_n$. At the last step in the required sequence of multiplications we have, in 
virtue of the induction hypothesis (which says that in any product of less than 
n terms the result is unambiguous and there is no need for parentheses), 

$(a_1 \cdots a_s)(a_{s+1} \cdots a_n)$ (*) 

for some s with 1 ≤ s ≤ n - 1. If s = n - 1, the product (*) takes the form 
$(a_1 \cdots a_{n-1})(a_n)$ which, according to our definition, is equal to $a_1 a_2 \cdots a_n$; so 
the proof in this case is finished. If s < n - 1, then the expression $(a_{s+1} \cdots a_n)$ 
of (*) includes two or more terms $a_i$ and we may rewrite (*) as 

$(a_1 \cdots a_s)(a_{s+1} \cdots a_n) = (a_1 \cdots a_s)((a_{s+1} \cdots a_{n-1})(a_n))$ 

$= ((a_1 \cdots a_s)(a_{s+1} \cdots a_{n-1}))(a_n)$ 

$= (a_1 \cdots a_{n-1})(a_n)$ 

$= a_1 \cdots a_n$. 

For completeness, we list the justifications for the various equalities above: 
first, by application of the definition (#) to the term $a_{s+1} \cdots a_n$; second, by the 
ordinary associative law; third, by the induction hypothesis; fourth, by the 
definition (#). This completes the proof. | 
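The statement can be tested by brute force in a concrete ring. The following is a minimal sketch in Python, assuming 2×2 integer matrices (stored as nested tuples) as an example of a noncommutative ring; the names `mat_mul`, `all_values`, and `left_nested` are hypothetical. The recursion in `all_values` mirrors the proof: every parenthesization ends with a last multiplication that splits the factors after some position s.

```python
def mat_mul(x, y):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def all_values(op, factors):
    """Set of results over every way of parenthesizing factors (order kept)."""
    if len(factors) == 1:
        return {factors[0]}
    results = set()
    for s in range(1, len(factors)):          # position of the last operation
        for left in all_values(op, factors[:s]):
            for right in all_values(op, factors[s:]):
                results.add(op(left, right))
    return results

def left_nested(op, factors):
    """The value fixed by definition (#): ((...((a1 a2) a3)...) a_n)."""
    value = factors[0]
    for f in factors[1:]:
        value = op(value, f)
    return value

factors = (
    ((1, 2), (3, 4)), ((0, 1), (1, 1)), ((2, 0), (1, 3)),
    ((1, 1), (0, 1)), ((5, 2), (2, 1)), ((1, 0), (4, 1)),
)
matrix_values = all_values(mat_mul, factors)

# Contrast: subtraction is not associative, so parentheses matter.
subtraction_values = all_values(lambda x, y: x - y, (1, 2, 3))
```

For the six matrices, all parenthesizations collapse to the single left-nested value, while subtraction on (1, 2, 3) already yields two different results.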


In exactly the same way, we have the generalized associative law for 
addition; after all, the preceding proof can obviously be applied to any binary 
operation which satisfies the associative law. Thus, the expression 

$a_1 + a_2 + \cdots + a_n$ 

has an unambiguous meaning. Even more, because addition is commutative 
we have the following more general result. 


2-4-2. Generalized Commutative-Associative Law for Addition. If n ≥ 2 
and $a_1, a_2, \ldots, a_n$ are elements of the ring R, then all sums of these n 
elements, in any order, are equal. 
Proof: According to the generalized associative law for addition the sum 

$a_1 + a_2 + \cdots + a_n$ 

is well defined (the result being independent of how the parentheses are 
inserted). By the same token, if we permute, or rearrange, the elements 
$a_1, a_2, \ldots, a_n$ and add them in this new order, then here too the sum has 
unambiguous meaning. (For example, when n = 6, one possible rearrangement 
of $a_1 + a_2 + a_3 + a_4 + a_5 + a_6$ is $a_5 + a_3 + a_1 + a_4 + a_6 + a_2$.) We 
shall show, by induction on n, that the sum of any such rearrangement is 
always equal to $a_1 + a_2 + \cdots + a_n$. 

For n = 2, there is only one possible rearrangement of $a_1 + a_2$, namely 
$a_2 + a_1$; and these are equal by virtue of the commutative law for addition. 
Now, suppose inductively that our generalized commutative-associative law 
holds for 2, 3, ..., n - 1 and consider the sum of any rearrangement of 
$a_1, a_2, \ldots, a_n$. In this sum, $a_n$ may appear as the last term, or as the first 
term, or as an in-between term. Thus, depending on the location of $a_n$, the 
sum takes one of the forms 

(i) (⋯) + ($a_n$), 

(ii) ($a_n$) + (⋯), 

(iii) (⋯) + ($a_n$) + (⋯). 

(Of course, the generalized associative law for addition permits us to insert 
parentheses in these ways.) Now, applying the commutative law, (ii) becomes 
(⋯) + ($a_n$); and applying the commutative law and the generalized associative 
law to (iii) leads to (⋯) + ($a_n$). Consequently, in all three cases, the sum is 
equal to 

(⋯) + ($a_n$). 

But this (⋯) represents the sum of some rearrangement of $a_1, a_2, \ldots, a_{n-1}$; 
so, by the induction hypothesis, it is equal to $a_1 + \cdots + a_{n-1}$. We obtain, 
therefore, 

$(\cdots) + (a_n) = (a_1 + a_2 + \cdots + a_{n-1}) + a_n$ 

which, according to the definition, equals $a_1 + \cdots + a_{n-1} + a_n$. This 
completes the proof. | 
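For a small n the law can be confirmed by listing every rearrangement. The following is a minimal sketch in Python, assuming the rationals (via `fractions.Fraction`) as the ring R; the name `left_nested_sum` is hypothetical. It adds each of the 5! = 120 orderings of five elements from left to right and collects the distinct totals.

```python
from fractions import Fraction
from functools import reduce
from itertools import permutations

def left_nested_sum(terms):
    """((...((t1 + t2) + t3)...) + tn), adding two elements at a time."""
    return reduce(lambda acc, t: acc + t, terms)

elements = [Fraction(1, 2), Fraction(-3), Fraction(5, 7), Fraction(2), Fraction(-1, 3)]

# One total per rearrangement; equal totals collapse inside the set.
totals = {left_nested_sum(order) for order in permutations(elements)}
```

The set `totals` contains a single value, the sum taken in the original order.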

The idea of the preceding proof is quite clear, in spite of the imperfections 
caused by our lack of a suitable notation for the sum of the terms of a 
permutation or rearrangement of $a_1, a_2, \ldots, a_n$. Such a notation will be introduced 
in Chapter IV. 
At this point, it becomes convenient to write sums and products in more 
compact form, so we introduce the standard Σ and Π notations. More 
precisely, we write 

$\sum_{i=1}^{n} a_i = a_1 + a_2 + \cdots + a_n$, 

$\prod_{i=1}^{n} a_i = a_1 a_2 \cdots a_n$, 

and, in particular, in virtue of what has gone before we have 

$\sum_{i=1}^{n} a_i = \left(\sum_{i=1}^{n-1} a_i\right) + a_n$, $\prod_{i=1}^{n} a_i = \left(\prod_{i=1}^{n-1} a_i\right) a_n$. 

2-4-3. Generalized Distributive Laws. If n ≥ 2 and $a, b_1, \ldots, b_n$ are 
elements of the ring R, then 

$a\left(\sum_{i=1}^{n} b_i\right) = \sum_{i=1}^{n} (a b_i)$ and $\left(\sum_{i=1}^{n} b_i\right) a = \sum_{i=1}^{n} (b_i a)$. 

Proof: The first of these is a compact form for the assertion 

$a(b_1 + b_2 + \cdots + b_n) = ab_1 + ab_2 + \cdots + ab_n$ (*) 

and it is an easy consequence of things we already know. In fact, for n = 2 the 
assertion is surely true; it is precisely the distributive law axiom for a ring. 
Then, suppose inductively that this generalized distributive law (*) holds for 
n - 1. We have, therefore, 

$a\left(\sum_{i=1}^{n} b_i\right) = a\left(\sum_{i=1}^{n-1} b_i + b_n\right)$ 

$= a\left(\sum_{i=1}^{n-1} b_i\right) + ab_n$ 

$= \sum_{i=1}^{n-1} (ab_i) + ab_n$ 

$= \sum_{i=1}^{n} (ab_i)$ 

so that (*) holds for n also. Thus, by mathematical induction, this generalized 
distributive law holds for all n ≥ 2. | 

In similar fashion, the other generalized distributive law holds for all 
n ≥ 2. It says 

$(b_1 + b_2 + \cdots + b_n)a = b_1 a + b_2 a + \cdots + b_n a$. 
2-4-4. Proposition. If $a_1, a_2, \ldots, a_m, b_1, b_2, \ldots, b_n$ are elements of the 
ring R, then 

$\left(\sum_{i=1}^{m} a_i\right)\left(\sum_{j=1}^{n} b_j\right) = \sum_{\substack{i=1,\ldots,m \\ j=1,\ldots,n}} (a_i b_j)$. 

Proof: The left-hand side of this equation is the product 

$(a_1 + a_2 + \cdots + a_m)(b_1 + b_2 + \cdots + b_n)$ 

while the right-hand side is the sum of all the terms in the rectangular array 
(with m rows and n columns) 

$a_1 b_1 \quad a_1 b_2 \quad \cdots \quad a_1 b_n$ 

$a_2 b_1 \quad a_2 b_2 \quad \cdots \quad a_2 b_n$ 

$\vdots$ 

$a_m b_1 \quad a_m b_2 \quad \cdots \quad a_m b_n$. 

According to the generalized commutative-associative law, 2-4-2, the order in 
which the terms of the rectangular array are added does not matter, and this 
is in accord with the notation 

$\sum_{\substack{i=1,\ldots,m \\ j=1,\ldots,n}} (a_i b_j)$, 

which does not specify in what order the terms are to be added. The proof we 
are about to give amounts to adding one row at a time. Namely, in virtue of 
the two generalized distributive laws (both of which are used), 

$\left(\sum_{i=1}^{m} a_i\right)\left(\sum_{j=1}^{n} b_j\right) = (a_1 + a_2 + \cdots + a_m)\left(\sum_{j=1}^{n} b_j\right)$ 

$= a_1\left(\sum_{j=1}^{n} b_j\right) + a_2\left(\sum_{j=1}^{n} b_j\right) + \cdots + a_m\left(\sum_{j=1}^{n} b_j\right)$ 

$= \sum_{j=1}^{n} (a_1 b_j) + \sum_{j=1}^{n} (a_2 b_j) + \cdots + \sum_{j=1}^{n} (a_m b_j)$ 

$= \sum_{\substack{i=1,\ldots,m \\ j=1,\ldots,n}} (a_i b_j)$. 
2-4-5. Remark. Consider the product $\prod_{i=1}^{n} a_i$ of $n$ elements of the ring $R$.
In the special case where all the $a_i$ are equal, say $a = a_1 = a_2 = \cdots = a_n$, it
is customary to write $a^n$ for $\prod_{i=1}^{n} a_i$. So according to the meaning of the
product, $\prod$, we have

$$a^1 = a, \quad a^2 = (a^1)(a), \quad \ldots, \quad a^n = (a^{n-1})a, \quad \ldots.$$

As a matter of fact, precisely these rules may be used to give a direct, recursive,
definition of $a^n$ for all $n \ge 1$ (in other words, each power of $a$, after the first,
is obtained by multiplying the preceding power of $a$, on the right, by $a$). Of
course, one should still think of $a^n$ as the product of $n$ copies of $a$.

From the generalized associative law for multiplication we obtain the
“familiar” rules

(i) $(a^m)(a^n) = a^{m+n}$,
(ii) $(a^m)^n = a^{mn}$,

for all $m, n \ge 1$.

In fact, $a^{m+n}$ equals $\underbrace{aa \cdots a}_{m+n}$, where there are $m + n$ terms, and by 2-4-1, we
may insert parentheses to obtain

$$\underbrace{aa \cdots a}_{m+n} = (\underbrace{a \cdots a}_{m})(\underbrace{a \cdots a}_{n})$$

where the first parenthesis contains $m$ terms (each of which is $a$) and the second
parenthesis contains $n$ terms; and the right side surely equals $(a^m)(a^n)$. This
proves (i).

The proof of (ii) consists of grouping the $mn$ copies of $a$ (which make up
$a^{mn}$), via parentheses, into $n$ groups each of which has $m$ copies of $a$. In detail,

$$\begin{aligned}
a^{mn} &= \underbrace{aa \cdots aa}_{mn} \\
&= \underbrace{(\underbrace{a \cdots a}_{m})(\underbrace{a \cdots a}_{m}) \cdots (\underbrace{a \cdots a}_{m})}_{n} \\
&= \underbrace{(a^m)(a^m) \cdots (a^m)}_{n} \\
&= (a^m)^n.
\end{aligned}$$

Note that $a^0$ is not defined as a rule. For if it were defined we would still
want property (i) to be satisfied; so in particular $a^0$ would satisfy

$$a^0 a^n = a^{0+n} = a^n$$

and

$$a^n a^0 = a^{n+0} = a^n.$$

Thus, $a^0$ would have to behave somewhat like an identity for multiplication.
But the ring $R$, in which we are working, need not have a multiplicative
identity, and in such a situation there is no obvious way to define $a^0$. In
addition, $a^n$ need not be defined for negative $n$; since, if $n = -1$ and condition
(i) still holds, we have

$$a^{-1} a^{1} = a^0 = a^{1} a^{-1}$$

so that $a^{-1}$ cannot be defined unless $a^0$ is already defined.
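
The recursive definition of $a^n$ in 2-4-5 can be tried out on a computer. The following is only a small sketch, not part of the text: it assumes the ring of 2 x 2 integer matrices (stored as nested tuples) as a convenient noncommutative example, and the names `mat_mul` and `power` are illustrative choices. It builds $a^n$ by repeated multiplication on the right and spot-checks rules (i) and (ii) for one choice of $a$, $m$, $n$.

```python
def mat_mul(x, y):
    """Multiply two 2x2 integer matrices stored as ((a, b), (c, d))."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def power(a, n, mul):
    """Return a^n using a^1 = a and a^n = (a^(n-1)) a; only n >= 1 is allowed."""
    if n < 1:
        raise ValueError("a^n is defined here only for n >= 1")
    result = a
    for _ in range(n - 1):
        result = mul(result, a)  # multiply the preceding power on the right by a
    return result


a = ((1, 1), (0, 1))
m, n = 3, 4

# Rule (i): (a^m)(a^n) = a^(m+n)
rule_i_holds = mat_mul(power(a, m, mat_mul), power(a, n, mat_mul)) == power(a, m + n, mat_mul)

# Rule (ii): (a^m)^n = a^(mn)
rule_ii_holds = power(power(a, m, mat_mul), n, mat_mul) == power(a, m * n, mat_mul)
```

Note that `power` refuses $n = 0$, in keeping with the observation that $a^0$ has no obvious meaning unless the ring has a multiplicative identity.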

2-4-6. Remark. Consider the sum $\sum_{i=1}^{n} a_i$ of $n$ elements of the ring $R$. In
the special case where all the $a_i$ are equal, say $a = a_1 = a_2 = \cdots = a_n$, it is
customary to write $na$ for $\sum_{i=1}^{n} a_i$. Note that here $n \in \mathbf{Z}$, $a \in R$, and $na \in R$,
but $na$ is not a product of two elements of $R$; however, we still refer to $na$ as
the product of $a$ and (the integer) $n$.

According to the meaning of the sum $\sum$ we have

$$1a = a, \quad 2a = (1a) + a, \quad \ldots, \quad na = (n-1)a + a.$$

Again, in analogy with the statement made for powers in 2-4-5, these rules
could be used to define $na$ recursively for all $n \ge 1$. Of course, $na$ should still
be thought of as the sum of $n$ copies of $a$. The reason for using the notation
$na$ when adding (or the notation $a^n$ when multiplying) rather than some other
notation, is that it obeys the usual kind of rules. Thus, for all $a, b \in R$ and
$m, n \ge 1$, we have

(i) $(m + n)a = ma + na$,
(ii) $n(ma) = (nm)a$,
(iii) $m(a + b) = ma + mb$,
(iv) $(ma)b = m(ab) = a(mb)$,
(v) $(ma)(nb) = (mn)(ab)$.

The proofs of these rules are rather trivial. Since $na$ is the analog for addition
of $a^n$ for multiplication, we see that $(m + n)a$ is the analog of $a^{m+n}$, and
$ma + na$ is the analog of $(a^m)(a^n)$. Thus, rule (i) is the additive analog of
$a^m a^n = a^{m+n}$, so it follows from the generalized associative law for addition.
Similarly, rule (ii) is the additive analog of $(a^m)^n = a^{mn}$, and it follows from the
generalized associative law for addition. Rule (iii) is an immediate consequence
of the generalized commutative-associative law for addition, since

$$\begin{aligned}
m(a + b) &= \underbrace{(a + b) + (a + b) + \cdots + (a + b)}_{m} \\
&= \underbrace{(a + a + \cdots + a)}_{m} + \underbrace{(b + b + \cdots + b)}_{m} \\
&= ma + mb.
\end{aligned}$$

To prove (iv), let $a = a_1 = a_2 = \cdots = a_m$ and $b = b_1 = b_2 = \cdots = b_m$; so in
virtue of 2-4-3,

$$\Big(\sum_{i=1}^{m} a_i\Big) b = \sum_{i=1}^{m} (a_i b) = m(ab)$$

and

$$a\Big(\sum_{i=1}^{m} b_i\Big) = \sum_{i=1}^{m} (a b_i) = m(ab).$$

Finally, to prove (v) one may take $a = a_1 = a_2 = \cdots = a_m$ and $b = b_1 =
b_2 = \cdots = b_n$ and use 2-4-4.

2-4-7. Remark. Can we extend the definition of $na$ so it is applicable for
all $n \le 0$, and in such a way that rules (i)-(v) then hold for all integers $m$ and $n$?
Suppose this has been done. Then putting $n = 0$ in (i), we have

$$ma = (m + 0)a = ma + 0a$$

which tells us that

$$0a = 0 \quad \text{for all } a \in R. \qquad (*)$$

(Note that here the 0 on the left side is in $\mathbf{Z}$, while the 0 on the right side is in
$R$.) Furthermore, if $n < 0$ we may put $m = -n$ in rule (i) and obtain

$$(-n)a + na = ((-n) + n)(a) = 0a = 0$$

which tells us that

$$na = -((-n)a) \quad \text{for all } n < 0 \text{ and } a \in R. \qquad (**)$$

In virtue of these observations, it is only natural that we should define $na$
for $n \le 0$ according to the rules (*) and (**). In particular, in the case $n = -1$,
we have

$$(-1)a = -((-(-1))a) = -a, \qquad a \in R.$$

More generally, the use of (**) for the definition of $na$ is permissible because,
if $n < 0$, then $(-n) > 0$, so $(-n)a$ has meaning (as in 2-4-6), and then $(-n)a$
has an additive inverse $-((-n)a)$ in $R$. This is not the only way one could
define $na$ when $n < 0$. An alternate method would be to put $na = (-n)(-a)$;
but this definition is essentially the same as (**) because we have

$$na = -((-n)a) = (-n)(-a), \qquad n < 0,\ a \in R. \qquad (***)$$

To check the validity of (***), we note first that if $m > 0$, then

$$0 = m0 = m(a + (-a)) = ma + m(-a)$$

and consequently

$$-(ma) = m(-a), \qquad m > 0,\ a \in R.$$

Now, when $n < 0$ we may apply this for $m = -n > 0$ to obtain $-((-n)a) =
(-n)(-a)$, so (***) does hold.
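
As an informal check on the definitions (*) and (**), here is a short sketch. It makes an illustrative assumption that is not in the text: the ring is taken to be $\mathbf{Z}_{12}$, and the helper names `add`, `neg`, and `multiple` are invented for the example. The code defines $na$ for every integer $n$ exactly as above and tests rules (i) and (ii), together with the alternate formula (***), over a small range of integers.

```python
MOD = 12  # illustrative modulus; Z_12 is just a convenient small ring


def add(x, y):
    """Addition in Z_12."""
    return (x + y) % MOD


def neg(x):
    """Additive inverse in Z_12."""
    return (-x) % MOD


def multiple(n, a):
    """Return na in Z_12 for any integer n, following 2-4-6 and 2-4-7."""
    if n == 0:
        return 0                      # rule (*): 0a = 0
    if n < 0:
        return neg(multiple(-n, a))   # rule (**): na = -((-n)a)
    result = a                        # 1a = a
    for _ in range(n - 1):
        result = add(result, a)       # na = (n - 1)a + a
    return result


ints = range(-6, 7)
ring = range(MOD)

# Rule (i): (m + n)a = ma + na, now for integers of either sign
rule_i = all(multiple(m + n, a) == add(multiple(m, a), multiple(n, a))
             for m in ints for n in ints for a in ring)

# Rule (ii): n(ma) = (nm)a
rule_ii = all(multiple(n, multiple(m, a)) == multiple(n * m, a)
              for m in ints for n in ints for a in ring)

# Formula (***): na = (-n)(-a) for n < 0
alt_def = all(multiple(n, a) == multiple(-n, neg(a))
              for n in ints if n < 0 for a in ring)
```

A finite check of this kind is of course no substitute for the case-by-case proofs; it only confirms that the definitions behave as claimed on the sample tried.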
We know the meaning of $na$ for every $n \in \mathbf{Z}$, and need to verify the rules
(i)-(v) of 2-4-6 for all integers $m$ and $n$. This is not hard. For example,
consider rule (iii): If $m > 0$, its validity was proved in 2-4-6; if $m = 0$, (iii)
is clearly valid; if $m < 0$ we have

$$\begin{aligned}
m(a + b) &= (-m)(-(a + b)) \\
&= (-m)((-a) + (-b)) \\
&= (-m)(-a) + (-m)(-b) \\
&= ma + mb
\end{aligned}$$

so (iii) holds in this case also. Similarly, rule (iv) holds for $m > 0$ and for
$m = 0$; while if $m < 0$ we have

$$\begin{aligned}
(ma)b &= ((-m)(-a))b \\
&= (-m)((-a)(b)) \\
&= (-m)(-(ab)) \\
&= m(ab)
\end{aligned}$$

and in analogous fashion $a(mb) = m(ab)$, so rule (iv) holds for all $m \in \mathbf{Z}$.

The verification of rules (i), (ii), and (v) is straightforward but somewhat
tedious because there are a number of cases to consider, depending on whether
$m$ and $n$ are positive, zero, or negative. The details are left to the reader.

There is still another familiar rule for computation which merits a reason¬ 
ably careful proof. 


2-4-8. Binomial Theorem. Suppose $R$ is a commutative ring with unity $e$;
then for any $a, b \in R$ we have

$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^{i}, \qquad n \ge 1.$$

Proof: Before undertaking the proof some preparatory comments are in
order. First of all, we need to discuss the meaning of the “binomial
coefficients” $\binom{n}{k}$. One starts from the factorial symbol, which is defined as

$$0! = 1, \qquad 1! = 1,$$

and inductively, for any $n \ge 1$,

$$(n + 1)! = (n + 1)(n!).$$

It follows that for any $n \ge 1$,

$$n! = (n)(n-1)(n-2) \cdots (2)(1);$$

in other words, $n!$ is the product of all the positive integers from 1 to $n$. Now,
for any $n \ge 1$ and any $k$ satisfying $0 \le k \le n$, let us define the binomial
coefficient $\binom{n}{k}$ by

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}, \qquad n \ge 1,\ 0 \le k \le n.$$

For example, we have

$$\binom{5}{2} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(2 \cdot 1)(3 \cdot 2 \cdot 1)} = 10,$$

$$\binom{7}{3} = \frac{7 \cdot 6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(3 \cdot 2 \cdot 1)(4 \cdot 3 \cdot 2 \cdot 1)} = 35,$$

$$\binom{5}{5} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 \cdot (0!)} = 1.$$

More generally, one observes immediately that

(i) $\binom{n}{k} = \dfrac{n(n-1) \cdots (n-k+1)}{1 \cdot 2 \cdot 3 \cdots k}$,

(ii) $\binom{n}{0} = \binom{n}{n} = 1$,

(iii) $\binom{n}{n-k} = \binom{n}{k}$.

All these facts are presumably familiar to the reader. Less familiar is the
following useful relation between binomial coefficients

$$\binom{n}{k} + \binom{n}{k-1} = \binom{n+1}{k}. \qquad (\#)$$

Its proof consists of a straightforward computation, namely,

$$\begin{aligned}
\binom{n}{k} + \binom{n}{k-1} &= \frac{n!}{k!\,(n-k)!} + \frac{n!}{(k-1)!\,(n-k+1)!} \\
&= \frac{n!\,(n-k+1)}{k!\,(n-k)!\,(n-k+1)} + \frac{(n!)\,k}{(k-1)!\,(k)\,(n-k+1)!} \\
&= \frac{n!\,(n-k+1) + (n!)\,k}{(k!)\,(n-k+1)!} \\
&= \frac{(n!)(n-k+1+k)}{(k!)\,(n+1-k)!} \\
&= \frac{(n+1)!}{k!\,(n+1-k)!} \\
&= \binom{n+1}{k}.
\end{aligned}$$





Incidentally, at this stage we do not yet know that $\binom{n}{k}$ is always an integer,
but the formula (#) may be used to give an easy proof of this fact. More
precisely, if $n = 1$, then the only possibilities for $k$ are $k = 0$ and $k = 1$; since
$\binom{1}{0} = 1$ and $\binom{1}{1} = 1$, we see that $\binom{n}{k}$ is an integer when $n = 1$ and $k$ is any integer
satisfying $0 \le k \le 1$. Now, suppose inductively that $\binom{n}{k}$ is an integer for a
fixed $n$ and all $k$ satisfying $0 \le k \le n$. We must show that $\binom{n+1}{k}$ is an integer for
each $k$ satisfying $0 \le k \le n + 1$. For $k = 0$ or $n + 1$ we have $\binom{n+1}{0} = \binom{n+1}{n+1} = 1$,
so $\binom{n+1}{k}$ is an integer in these cases. Furthermore, for $k$ satisfying $0 < k <
n + 1$, the relation (#) expresses $\binom{n+1}{k}$ as a sum of two terms, each of which is
an integer by the induction hypothesis, so $\binom{n+1}{k}$ is an integer. We conclude that
$\binom{n}{k}$ is always an integer.
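
The induction just described is, in effect, an algorithm: relation (#) builds row $n + 1$ of binomial coefficients from row $n$ using only integer addition. The sketch below illustrates this; the function name `binomial_rows` is an invented label, and the comparison against `math.comb` from the Python standard library is there only as an independent check of the factorial formula.

```python
from math import comb


def binomial_rows(max_n):
    """Return {n: [C(n,0), ..., C(n,n)]} for 1 <= n <= max_n, built from relation (#)."""
    rows = {1: [1, 1]}                      # base case: C(1,0) = C(1,1) = 1
    for n in range(1, max_n):
        prev = rows[n]
        # C(n+1, k) = C(n, k) + C(n, k-1) for 0 < k < n + 1; the two ends equal 1
        middle = [prev[k] + prev[k - 1] for k in range(1, n + 1)]
        rows[n + 1] = [1] + middle + [1]
    return rows


rows = binomial_rows(10)

# Every entry is produced by adding integers, and agrees with n! / (k! (n-k)!)
agrees_with_formula = all(rows[n][k] == comb(n, k)
                          for n in rows for k in range(n + 1))
```

Since each new entry is a sum of two entries already known to be integers, no division is ever performed, which is exactly the point of the inductive argument.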

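Turning to the formula of the binomial theorem, 2-4-8, we may write out,
or expand, the right side. Since $\binom{n}{0} = \binom{n}{n} = 1$ and $a^0 = b^0 = e \in R$, the result
is

$$(a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n-1} a b^{n-1} + b^n.$$

Thus, what we have called the binomial theorem is indeed the standard
statement, known to all, except that we deal with it in the general context of
a commutative ring with unity. Now, let us give a formal proof, by induction
on $n$. For $n = 1$, we have

$$(a + b)^1 = a + b = \binom{1}{0} a^1 b^0 + \binom{1}{1} a^0 b^1$$

so the binomial theorem holds in this case. Next, suppose inductively that the
binomial theorem holds for $n$. We must prove the validity of the binomial
theorem for $n + 1$. In other words, given

$$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^{i},$$

we must show that

$$(a + b)^{n+1} = \sum_{i=0}^{n+1} \binom{n+1}{i} a^{n+1-i} b^{i}.$$

This is not hard; in fact, we have

$$\begin{aligned}
(a + b)^{n+1} &= (a + b)^n (a + b) \\
&= \sum_{i=0}^{n} \binom{n}{i} a^{n+1-i} b^{i} + \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^{i+1} \\
&= a^{n+1} + \sum_{i=1}^{n} \binom{n}{i} a^{n+1-i} b^{i} + \sum_{i=1}^{n} \binom{n}{i-1} a^{n-(i-1)} b^{(i-1)+1} + b^{n+1} \\
&= \binom{n+1}{0} a^{n+1} + \sum_{i=1}^{n} \Big[\binom{n}{i} + \binom{n}{i-1}\Big] a^{n+1-i} b^{i} + \binom{n+1}{n+1} b^{n+1} \\
&= \sum_{i=0}^{n+1} \binom{n+1}{i} a^{n+1-i} b^{i},
\end{aligned}$$

where the last step uses relation (#). This completes the proof. |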


If the commutative ring $R$ does not have a unity for multiplication, then
$a^0$ and $b^0$ are not defined, but the binomial theorem still holds with only
trivial modifications of notation. The nice symmetrical expression $\sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^{i}$
cannot be used because when $i = 0$ or $i = n$ we find terms containing $b^0$ and $a^0$,
respectively. So the appropriate form for the binomial theorem is

$$(a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{n-1} a b^{n-1} + b^n.$$

Naturally, the proof follows the same principles as before, and it may safely
be left to the reader.

We conclude this section with an interesting application of the binomial 
theorem. 


2-4-9. Theorem (Fermat: 1601-1665). If $p$ is prime, then

$$n^p \equiv n \pmod{p} \quad \text{for all } n \in \mathbf{Z}.$$

To put it another way

$$a^p = a \quad \text{for every } a \in \mathbf{Z}_p.$$

Proof: Consider the first assertion. It is clearly true for $n = 0$ and $n = 1$.
Suppose, inductively, that the assertion holds for the positive integer $n$; we
must prove its validity for $n + 1$. According to the binomial theorem,

$$(n + 1)^p = n^p + \binom{p}{1} n^{p-1} + \binom{p}{2} n^{p-2} + \cdots + \binom{p}{p-1} n + 1. \qquad (*)$$

Now, let us look at the binomial coefficients $\binom{p}{k}$ for $k = 1, 2, \ldots, p - 1$. We
know that

$$\binom{p}{k} = \frac{p(p-1) \cdots (p-k+1)}{1 \cdot 2 \cdots k}$$

is an integer. Since the prime $p$ appears in the numerator and all the terms in
the denominator are positive integers less than $p$, it follows that the integer
$\binom{p}{k}$ is divisible by $p$. (Indeed, $p$ divides the product $k! \binom{p}{k}$ but, being prime,
divides no factor of $k!$.) Consequently, all the terms on the right side of (*), except
for the two end terms, are divisible by $p$; so we have

$$(n + 1)^p \equiv n^p + 1^p \equiv n + 1 \pmod{p}$$

(note the use of the induction hypothesis) and our assertion holds for $n + 1$.
Therefore, Fermat’s theorem holds for all $n \ge 0$.

What if $n$ is negative? If $n < 0$, then surely there exists an integer $m \ge 0$
such that $n \equiv m \pmod{p}$. (In fact, by applying the division algorithm to $n$ and $p$,
we see that $m$ may be taken to satisfy $0 \le m < p$.) Then, in virtue of the
foregoing and the obvious fact (see 2-4-10) that raising two congruent integers to
the same power preserves the congruence, we have

$$n^p \equiv m^p \equiv m \equiv n \pmod{p}.$$

This completes the proof of the first assertion.

Finally, consider any element $a \in \mathbf{Z}_p$. According to the meaning of residue
class (mod $p$), $a$ can be written in the form $a = \lfloor n \rfloor_p$ for some $n \in \mathbf{Z}$. Because
$n^p \equiv n \pmod{p}$, we have $\lfloor n^p \rfloor_p = \lfloor n \rfloor_p$, and therefore,

$$a^p = (\lfloor n \rfloor_p)^p = \lfloor n^p \rfloor_p = \lfloor n \rfloor_p = a.$$

This completes the proof. |
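
Both ingredients of this proof are easy to test numerically. The following sketch is only an experiment, with invented function names `middle_binomials_divisible` and `fermat_holds` and an arbitrary search bound; it checks, for a few small primes, that $\binom{p}{k}$ is divisible by $p$ for $0 < k < p$ and that $n^p \equiv n \pmod{p}$ for integers $n$ of either sign. For contrast it also tries the composite number 4, where both statements fail.

```python
from math import comb


def middle_binomials_divisible(p):
    """True when p divides C(p, k) for every k with 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))


def fermat_holds(p, bound=30):
    """True when n^p is congruent to n (mod p) for all integers n with |n| <= bound."""
    return all((n ** p - n) % p == 0 for n in range(-bound, bound + 1))


small_primes = [2, 3, 5, 7, 11, 13]
primes_ok = all(middle_binomials_divisible(p) and fermat_holds(p)
                for p in small_primes)

# For the composite modulus 4 both properties break down:
# C(4, 2) = 6 is not divisible by 4, and 2^4 - 2 = 14 is not divisible by 4.
composite_divisible = middle_binomials_divisible(4)
composite_fermat = fermat_holds(4)
```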


2-4-10. PROBLEMS

1. Prove, in complete detail, that in any ring

$$a_3 + a_5 + a_1 + a_4 + a_2 = a_1 + a_2 + a_3 + a_4 + a_5.$$

How many such rearrangements of $a_1, a_2, a_3, a_4, a_5$ are there?

2. For elements of a commutative ring, state and prove the generalized
commutative-associative law for multiplication.

3. We proved 2-4-4 by adding the various rows of a certain rectangular
array with $m$ rows and $n$ columns. Prove this result in another way,
namely, by what amounts to summing the columns of the rectangular
array.

4. Supply the missing proofs for the rules discussed in 2-4-7.

5. Suppose $a$ and $b$ are integers with $a \equiv b \pmod{m}$. Show that for all $n \ge 1$
we have

$$a^n \equiv b^n \pmod{m}$$

and for all $n \in \mathbf{Z}$ we have

$$na \equiv nb \pmod{m}.$$

Even more, if $f(x) = c_0 + c_1 x + \cdots + c_r x^r$ with $c_0, c_1, \ldots, c_r \in \mathbf{Z}$, then
$f(a) \equiv f(b) \pmod{m}$.

6. If $a$ belongs to the ring $R$ and $n$ is an integer greater than 0, is $na$ equal to
$(-n)(-a)$? Why?

7. In any ordered domain $D$, prove by induction:

(i) $a_1^2 + a_2^2 + \cdots + a_n^2 \ge 0$, and $a_1^2 + a_2^2 + \cdots + a_n^2 = 0 \iff a_1 =
a_2 = \cdots = a_n = 0$.

(ii) If $a$ is negative, then any odd positive power of $a$ is also negative.

(iii) If $a < b$ and $n > 0$ is odd, then $a^n < b^n$.

8. In an ordered domain, show that $a^9 = b^9$ implies $a = b$. Does this result
apply for any positive odd power? How about even powers?

9. In 2-4-8 we used induction to prove that the binomial coefficient $\binom{n}{k}$ is
always an integer. For each $\binom{n}{k}$ consider the point $(n, k)$ in the plane. Use
the collection of all such integral points to explain the idea on which we
based the proof that $\binom{n}{k}$ is always an integer.

10. In the statement of the binomial theorem, why must $R$ be commutative?
What happens if $R$ is not commutative? What is $(a + b)^n$ in this situation?

11. For $a, b, c$ in the commutative ring $R$, what is $(a + b + c)^n$?

12. In the commutative ring $R$, compute
(i) $(2a + 3b)^5$, (ii) $(2a - 3b)^7$.

13. An element $a \ne 0$ of the ring $R$ is said to be nilpotent if there exists a
positive integer $n$ for which $a^n = 0$.

(i) Show that in an integral domain there are no nilpotent elements.

(ii) Can you find any nilpotent elements in M(Z, 2)?
14. Evaluate the following binomial coefficients

<o (”), m (V s ). 

<» (T)- « 

15. Prove, by induction on $n$, that a set with $n$ elements has exactly $2^n$ subsets.

16. (i) A set with $n$ elements has exactly $\binom{n}{k}$ subsets which consist of $k$
elements (when, of course, $0 \le k \le n$).

(ii) Use the binomial expansion of $2^n = (1 + 1)^n$ in conjunction with part
(i) to conclude that a set with $n$ elements has exactly $2^n$ subsets.

17. Suppose $D$ is a well-ordered domain and we are given elements $a > 0$ and
$b$ of $D$. Prove the existence of an integer $n$ for which $na > b$. In other
words, prove that a well-ordered domain satisfies what is commonly
known as the Archimedean property.

18. Suppose $R$ is a ring with unity $e$, and $a$ is an element of $R$ which has an
inverse (denoted by $a^{-1}$) for multiplication. Define $a^n$ for all $n \in \mathbf{Z}$. State
the multiplicative analogs (where meaningful) of the five rules about $na$
which were discussed in 2-4-6 and 2-4-7. Examine these multiplicative
analogs and decide which ones are valid (and under what hypotheses);
then prove the valid ones.

2-5. Characterization of the Integers

When should we consider two rings to be the “same”? Must they be fully
identical? Several instances related to this kind of question have already come
up, but they were glossed over at the time. For example, in 2-2-9 it was noted
informally that the ring of Gaussian integers $\mathbf{Z}[i] = \{a + bi \mid a, b \in \mathbf{Z}\}$ (in
which the operations are the standard ones for complex numbers) is
“essentially the same” as the ring gotten by taking the set $\mathbf{Z} \times \mathbf{Z} = \{(a, b) \mid a, b \in \mathbf{Z}\}$
and defining the operations of addition and multiplication by

$$(a, b) + (c, d) = (a + c,\ b + d),$$

$$(a, b) \cdot (c, d) = (ac - bd,\ ad + bc).$$

The basic difference is in notation; nothing is really affected by writing $(a, b)$
for $a + bi$, or vice versa.
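
To see concretely that only the notation differs, one can compare the pair operations with ordinary complex arithmetic. The sketch below is an illustration only: the function names `pair_add`, `pair_mul`, and `to_complex` are invented, and Python's built-in `complex` type stands in for the Gaussian integers (it stores floating-point parts, which are exact for the small integers used here).

```python
def pair_add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (p[0] + q[0], p[1] + q[1])


def pair_mul(p, q):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    a, b = p
    c, d = q
    return (a * c - b * d, a * d + b * c)


def to_complex(p):
    """Rewrite the pair (a, b) as the complex number a + bi."""
    return complex(p[0], p[1])


pairs = [(a, b) for a in range(-3, 4) for b in range(-3, 4)]

# Adding (or multiplying) pairs and then rewriting gives the same result as
# rewriting first and then adding (or multiplying) complex numbers.
addition_matches = all(to_complex(pair_add(p, q)) == to_complex(p) + to_complex(q)
                       for p in pairs for q in pairs)
multiplication_matches = all(to_complex(pair_mul(p, q)) == to_complex(p) * to_complex(q)
                             for p in pairs for q in pairs)
```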

Furthermore, in 2-2-13, the set of all subsets of a given set $X$ was
made into a ring by setting up a 1-1 correspondence between its elements and
the elements of the ring Map($X$, $\mathbf{Z}_2$) and then transferring the ring operations
bodily from Map($X$, $\mathbf{Z}_2$) to the set of subsets of $X$. Obviously, these two rings
should be considered to be the “same.”
What are the essential ingredients of sameness? Clearly, it depends on the
nature of the objects under discussion and on which of their properties we
choose to focus. (For example, during political campaigns one often hears the
statement that two opposing candidates are the “same.” This is surely a
matter of definition; it depends on which characteristics or positions of the
candidates are counted and which ones are ignored.) If we are dealing with
rings, then, as indicated by the examples mentioned above, two rings may be
considered to be the “same” even though they are different sets. On the other
hand, we have seen (in 2-2-9, for example) that it is often possible to make a
given set, such as $\mathbf{Z} \times \mathbf{Z}$, into a ring in several apparently different ways.

Thus, for a ring, it is not the underlying set alone which matters. In fact, 
we recall that when being fussy about the definition of a ring, we referred to it 
as a triple {R, +, •} consisting of a set R and two operations + and • which 
satisfy certain axioms. In other words, there are three features which distin¬ 
guish a ring—the set of its elements and the two operations. After all, to define 
a ring these are the three things which must be specified. Consequently, for 
the “ sameness ” of two rings, it is natural to require that we have, simultane¬ 
ously, the sameness of the two sets, of the two operations +, and of the 
two operations •. Now, let us turn to the clarification of what is meant by 
sameness for sets and for operations. 

2-5-1. Remark. In the context of our discussion, it should occasion no
surprise that two sets are considered to be the same when there is a 1-1
correspondence between them. This is a familiar notion, but it is useful for us
to develop it carefully.

Consider two arbitrary sets $X$ and $Y$. By a function or mapping $f$ from $X$ to
$Y$ (which we denote by $f: X \to Y$) we mean, as indicated earlier in 2-2-12, a
rule which assigns to each element $x \in X$ a unique element $f(x)$ of $Y$. In other
words, one may think of a function as a black box or machine: every time one
throws in an element $x$ of $X$ out comes an element $f(x)$ of $Y$. We write
$f: x \to f(x)$ or simply $x \to f(x)$, and refer to the element $f(x)$ as the image of $x$
under the mapping $f$, or as the value of $f$ at $x$. Note that a function $f$ must be
defined for every $x \in X$; if there is even a single $x$ for which there is no assigned
image $f(x)$, then $f$ is not a function. Furthermore, to each $x \in X$ there is
assigned exactly one element $f(x)$; if there appear to be several possibilities
for $f(x)$, then, in order for $f$ to be a function, exactly one of these possibilities
must be specified as the value of $f(x)$.

Extending the notation used in 2-2-12, we denote the set of all mappings
from $X$ into $Y$ by Map($X$, $Y$). In keeping with 2-2-12, two mappings from
$X$ into $Y$ are equal when they take equal values for every element of $X$;
symbolically: For $f, g \in$ Map($X$, $Y$),

$$f = g \iff f(x) = g(x) \text{ for all } x \in X.$$
Suppose we are given a function $f: X \to Y$. Let $f(X)$ denote the set of all
elements that arise as images of elements of $X$ under $f$; in other words,

$$f(X) = \{f(x) \mid x \in X\}.$$

Thus, $f(X)$ is a subset of $Y$, and it is known as the image of $X$ under $f$, or
simply as the image of $f$. When there is no danger of confusion, we denote it
by im $f$. As a rule $f(X) \ne Y$; but when $f(X) = Y$, that is, when every element
of $Y$ is the image under $f$ of some element of $X$, the mapping $f: X \to Y$ is said
to be onto or surjective. Thus, $f$ is surjective if and only if given any $y \in Y$
there exists an $x \in X$ such that $f(x) = y$.

Continuing with an arbitrary function $f: X \to Y$, it may well be that two
distinct elements of $X$ have the same image; that is, $x_1 \ne x_2$, but $f(x_1) =
f(x_2)$. When this is not the case, that is, when $f$ has the property that distinct
elements of $X$ always have distinct images under $f$, then $f$ is said to be one to
one or injective. Thus, $f$ is injective if and only if it satisfies the following
condition for any $x_1, x_2 \in X$:

$$\text{If } x_1 \ne x_2, \text{ then } f(x_1) \ne f(x_2). \qquad (*)$$

Of course, this condition may be reformulated as follows:

$$\text{If } f(x_1) = f(x_2), \text{ then } x_1 = x_2. \qquad (**)$$

Note that according to our definition $f$ is not injective if and only if there exist
distinct elements of $X$ which have the same image under $f$.
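
For finite sets these definitions can be checked mechanically. The sketch below is an illustration with invented helper names (`image`, `is_injective`, `is_surjective`); since a computer cannot enumerate all of $\mathbf{Z}$, it works with the finite window $X = \{-5, \ldots, 5\}$ and explicitly chosen finite codomains, which is an assumption made only for the example.

```python
def image(f, domain):
    """The set f(X) = {f(x) | x in X}."""
    return {f(x) for x in domain}


def is_injective(f, domain):
    """Distinct elements have distinct images exactly when no values collapse."""
    domain = set(domain)
    return len(image(f, domain)) == len(domain)


def is_surjective(f, domain, codomain):
    """f is onto when its image is all of the codomain."""
    return image(f, domain) == set(codomain)


X = range(-5, 6)


def double(x):
    return 2 * x


def square(x):
    return x * x


def shift(x):
    return x + 2


double_injective = is_injective(double, X)
square_injective = is_injective(square, X)                # fails: (-1)^2 = 1^2
double_onto = is_surjective(double, X, range(-10, 11))    # fails: odd numbers are missed
shift_onto = is_surjective(shift, X, range(-3, 8))
```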

2-5-2. Examples. (1) Suppose $X = \mathbf{R}$, $Y = \mathbf{R}$ and let $f: \mathbf{R} \to \mathbf{R}$ be given by

$$f(x) = \sqrt{x}, \qquad x \in \mathbf{R}.$$

In other words, $f$ is the rule which assigns to each real number $x$ its square root
$\sqrt{x}$. (Of course, by square root we always mean the positive square root.)
Unfortunately, if $x$ is negative, then it has no square root, meaning that there
is no real number whose square is $x$. Thus, $f$ is not defined for every $x \in \mathbf{R}$,
so according to our definition, $f$ is not a function.

(2) Suppose $X = \mathbf{C}$, $Y = \mathbf{C}$ and let $f: \mathbf{C} \to \mathbf{C}$ be given by

$$f(z) = \sqrt{z}, \qquad z \in \mathbf{C}.$$

In order for $f$ to be a function we must, first of all, be certain that every
complex number has a square root; in other words, given any $z = a + bi \in \mathbf{C}$
we must be certain of the existence of real numbers $x$ and $y$ such that

$$(x + yi)^2 = a + bi.$$

This amounts to solving the simultaneous equations

$$x^2 - y^2 = a, \qquad 2xy = b \qquad (\#)$$

for real numbers $x$ and $y$. If we put

$$\operatorname{sign} b = \begin{cases} +1 & \text{for } b \ge 0, \\ -1 & \text{for } b < 0, \end{cases}$$

and read this symbol as the “sign of $b$,” then the reader may easily derive (or
simply check) the fact that a solution of these equations is given by

$$x = \sqrt{\frac{a + \sqrt{a^2 + b^2}}{2}}, \qquad y = (\operatorname{sign} b)\sqrt{\frac{-a + \sqrt{a^2 + b^2}}{2}}. \qquad (\#\#)$$

Note that all these square roots have meaning because the expressions inside
them are always greater than or equal to 0. The need for sign $b$ arises in
connection with the second equation, $2xy = b$. Under our convention about
always taking the positive square root,

$$\sqrt{b^2} = |b| \quad \text{and} \quad (\operatorname{sign} b)\sqrt{b^2} = b$$

and consequently the given values for $x$ and $y$ satisfy both of the equations (#).

We now know that every complex number has a square root. But if $x + yi$
is a square root of $a + bi$, then so is $-x - yi$. Moreover, for all we know there
may be other square roots of $a + bi$. Thus, we are in some difficulty with
regard to which square root of $z = a + bi$ to take as $f(z)$. One way out is to
take $\sqrt{z}$ to be $x + yi$ where $x$ and $y$ are given by (##). Another way is to make
no choice for $\sqrt{z}$; in this case, $f$ is not a function.
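
The explicit solution of the two simultaneous equations can be checked numerically. The sketch below is a floating-point illustration only: the names `sign` and `complex_sqrt` are invented, `math.hypot` is used to compute $\sqrt{a^2 + b^2}$, and the `max(..., 0.0)` guards are an added precaution against rounding error rather than part of the formula.

```python
import math


def sign(b):
    """The 'sign of b' from the text: +1 for b >= 0 and -1 for b < 0."""
    return 1.0 if b >= 0 else -1.0


def complex_sqrt(a, b):
    """Return real (x, y) with (x + yi)^2 = a + bi, using the explicit formula."""
    r = math.hypot(a, b)                                   # sqrt(a^2 + b^2)
    x = math.sqrt(max((a + r) / 2.0, 0.0))
    y = sign(b) * math.sqrt(max((-a + r) / 2.0, 0.0))
    return x, y


samples = [(3, 4), (0, 2), (-4, 0), (5, -12), (0, 0)]

# Largest deviation of (x + yi)^2 from a + bi over the samples
worst_error = max(abs(complex(*complex_sqrt(a, b)) ** 2 - complex(a, b))
                  for a, b in samples)
```

Choosing this particular pair $(x, y)$ for every $z$ is exactly the "one way out" mentioned above: it singles out one of the two square roots $\pm(x + yi)$, so that $f$ becomes a genuine function.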

(3) Suppose $X = \mathbf{Z}$, $Y = \mathbf{Z}$ and let $f: \mathbf{Z} \to \mathbf{Z}$ be given by

$$f(x) = 2x, \qquad x \in \mathbf{Z}.$$

Surely, $f$ is a function which maps each integer to twice that integer. In
particular, $f(1) = 2$, $f(5) = 10$, $f(0) = 0$, $f(79) = 158$, $f(-1) = -2$, $f(-17) = -34$.
The image of $f$ is

$$f(\mathbf{Z}) = \{f(x) \mid x \in \mathbf{Z}\} = \{2x \mid x \in \mathbf{Z}\} = 2\mathbf{Z},$$

the set of all even integers. Therefore, no odd integer is in the image of
$f$, and $f$ is not surjective. On the other hand, $f$ is injective; for if $m$ and $n$ are
integers which have the same image under $f$ then $m = n$, because

$$f(m) = f(n) \implies 2m = 2n \implies m = n,$$

so criterion (**) for injectivity is satisfied.

(4) Consider the mapping $f: \mathbf{Z} \to \mathbf{Z}$ defined by

$$f(x) = x^2, \qquad x \in \mathbf{Z}.$$

Thus, $f$ maps each integer to its square; it is surely a function in terms of our
usage of this word. Since $f(-1) = f(+1) = 1$ and, in general, $f(-x) = f(x)$
for all $x \in \mathbf{Z}$, we see that $f$ is not injective. Furthermore, the image of $f$

$$f(\mathbf{Z}) = \{x^2 \mid x \in \mathbf{Z}\}$$

consists of all integers which are perfect squares; consequently, $f$ is not
surjective.

(5) Consider the “squaring” function $f: \mathbf{C} \to \mathbf{C}$; in other words

$$f(z) = z^2, \qquad z \in \mathbf{C}.$$

Is $f$ surjective? The image of $f$ is $\{z^2 \mid z \in \mathbf{C}\}$, the set of all squares of
complex numbers. But we have seen in (2) above that every complex number has
a square root; or, to put it another way, every complex number can be
expressed as the square of some complex number. Therefore, $f$ is surjective.

Is $f$ injective? Obviously, the answer is no, because $f(-z) = f(z)$ for all
$z \in \mathbf{C}$.

(6) Consider the mapping $f: \mathbf{Z} \to \mathbf{Z}$ defined by

$$f(x) = x + 2, \qquad x \in \mathbf{Z}.$$

Given any $n \in \mathbf{Z}$, we have $f(n - 2) = n$, so $f$ is surjective. Since $f(x) = f(y) \implies
x + 2 = y + 2 \implies x = y$, we see that $f$ is injective.

2-5-3. Remark. Suppose we have three sets $X$, $Y$, $Z$ and two mappings
$f: X \to Y$ and $g: Y \to Z$. Then we may define the composite (or composition)
function

$$g \circ f: X \to Z$$

by taking

$$(g \circ f)(x) = g(f(x)) \quad \text{for all } x \in X.$$

The name “composite” is used because we “compose” the two mappings $g$
and $f$ to obtain $g \circ f$; this means that $g \circ f$ amounts to applying the mapping
$f$ and then applying $g$ to the result. It is convenient to describe this situation
by the “picture”

$$X \xrightarrow{\ f\ } Y \xrightarrow{\ g\ } Z, \qquad X \xrightarrow{\ g \circ f\ } Z$$

or by the phrase: If $f \in$ Map($X$, $Y$) and $g \in$ Map($Y$, $Z$), then we have
$g \circ f \in$ Map($X$, $Z$).
Two simple properties of the composite function $g \circ f$ are:

(i) If both $f$ and $g$ are surjective, so is $g \circ f$.

(ii) If both $f$ and $g$ are injective, so is $g \circ f$.

Proof: (i) Consider any $z \in Z$. Since $g: Y \to Z$ is onto, there exists $y \in Y$
such that $g(y) = z$. Then, because $f: X \to Y$ is onto there exists $x \in X$ such that
$f(x) = y$. Consequently, $(g \circ f)(x) = z$, so $g \circ f$ is surjective.

(ii) Suppose $x_1$ and $x_2$ are distinct elements of $X$, so $x_1 \ne x_2$. Because $f$
is injective, we have $f(x_1) \ne f(x_2)$; and then because $g$ is injective, we have
$g(f(x_1)) \ne g(f(x_2))$. Thus,

$$x_1 \ne x_2 \implies (g \circ f)(x_1) \ne (g \circ f)(x_2),$$

so $g \circ f$ is injective. |
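
For mappings between finite sets, composition and the two properties just proved are easy to illustrate. The sketch below stores each mapping as a Python dictionary; the sets, the names `compose`, `is_injective`, `is_surjective`, and the particular maps `f` and `g` are all invented for the example.

```python
def compose(g, f):
    """Return g o f as a dict: x is sent to g(f(x)) for every x in the domain of f."""
    return {x: g[f[x]] for x in f}


def is_injective(m):
    """A finite map is injective when no two keys share a value."""
    return len(set(m.values())) == len(m)


def is_surjective(m, codomain):
    """A finite map is surjective when its values fill the codomain."""
    return set(m.values()) == set(codomain)


# f: X -> Y and g: Y -> Z with X = {1, 2, 3}, Y = {"a", "b", "c"}, Z = {10, 20, 30}
f = {1: "a", 2: "b", 3: "c"}
g = {"a": 10, "b": 20, "c": 30}
Z = {10, 20, 30}

g_of_f = compose(g, f)

# Both f and g are injective and surjective, and so is the composite.
composite_injective = is_injective(g_of_f)
composite_surjective = is_surjective(g_of_f, Z)
```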


2-5-4. Remark. Suppose we are given a map $f: X \to Y$ which is both
1-1 and onto; in other words, $f$ is both injective and surjective. Let us take
any element $y \in Y$. Because $f$ is surjective, there exists $x \in X$ for which $f(x) = y$.
Moreover, this element $x$ is uniquely determined by $y$. In fact, if $x' \in X$ is also
an element for which $f(x') = y$, then, because $f$ is injective, $f(x) = f(x')$
implies $x = x'$. Therefore, we may define a mapping

$$g: Y \to X$$

by putting

$$g(y) = x, \qquad y \in Y$$

where $x$ is the element determined by $y$ according to the procedure described
above. In other words, $g$ and $f$ are connected to each other by the relation

$$x = g(y) \iff y = f(x).$$

Our situation may be described by the pictures or diagrams

$$X \xrightarrow{\ f\ } Y \xrightarrow{\ g\ } X, \qquad Y \xrightarrow{\ g\ } X \xrightarrow{\ f\ } Y$$

and both of these can be coalesced into the diagram

$$X \underset{g}{\overset{f}{\rightleftarrows}} Y.$$

Now, let us examine the two composite maps

$$g \circ f: X \to X \quad \text{and} \quad f \circ g: Y \to Y.$$

For any $x \in X$, we have

$$(g \circ f)(x) = g(f(x)) = g(y) = x$$

which says that applying $f$ (to $x$) and then following it by $g$ takes us back to the
element we started with (namely, to $x$). As is customary, let us denote the
“identity map” of $X$ (meaning the mapping which carries every element of
$X$ to itself) by $1: X \to X$, so

$$1(x) = x \quad \text{for all } x \in X.$$

In virtue of this notation and the meaning of equality of mappings, we have
proved that

$$g \circ f = 1 \quad \text{(the identity map of } X\text{)}. \qquad (*)$$

Similarly, for any $y \in Y$ we have

$$(f \circ g)(y) = f(g(y)) = f(x) = y$$

so that

$$f \circ g = 1 \quad \text{(the identity map of } Y\text{)}. \qquad (**)$$

Because properties (*) and (**) are satisfied, we say that the mapping $g$ is an
inverse of $f$. One often writes $f^{-1}$ (and reads it as “$f$ inverse”) instead of $g$;
so when $f$ is injective and surjective we have $f \circ f^{-1} = 1$, $f^{-1} \circ f = 1$.

Incidentally, it should be noted that the inverse map $g: Y \to X$ constructed
from our surjective and injective map $f: X \to Y$ is also surjective and injective.
In fact, given any $x \in X$ we have, according to (*), $g(f(x)) = x$, so $f(x)$ is an
element of $Y$ whose image under $g$ is $x$, and hence $g$ is surjective. In addition,
$g(y_1) = g(y_2)$ implies $f(g(y_1)) = f(g(y_2))$, which says that $y_1 = y_2$; hence $g$ is
injective.

Once we know that $g$ is both surjective and injective, the foregoing
discussion can be applied to $g$; in particular, $g$ has an inverse, and the reader may
easily convince himself that $f$ is an inverse of $g$. (In this connection see
Problem 11 at the end of this section.)
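
The construction of the inverse can be mirrored for finite mappings stored as dictionaries. In this sketch, the function name `inverse` and the sample map are invented for the illustration; the function refuses to build $g$ unless $f$ is both injective and surjective, and the two composites are then compared with the identity maps.

```python
def inverse(f, codomain):
    """Return g with g(y) = x exactly when f(x) = y; f must be 1-1 and onto."""
    values = list(f.values())
    if len(set(values)) != len(values):
        raise ValueError("f is not injective, so x is not determined by y")
    if set(values) != set(codomain):
        raise ValueError("f is not surjective, so some y has no x")
    return {y: x for x, y in f.items()}


f = {1: "a", 2: "b", 3: "c"}
Y = {"a", "b", "c"}
g = inverse(f, Y)

# g o f is the identity map of X, and f o g is the identity map of Y
g_after_f = {x: g[f[x]] for x in f}
f_after_g = {y: f[g[y]] for y in Y}

# Applying the construction to g gives back f, so f is an inverse of g
f_again = inverse(g, set(f))
```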

2-5-5. Definition. There is said to be a 1-1 correspondence between the
sets $X$ and $Y$ when there exists a mapping $f: X \to Y$ which is both injective and
surjective. This mapping $f$ is then called a 1-1 correspondence between $X$ and
$Y$.

This mapping $f$ need not be unique; once $X$ and $Y$ are in 1-1
correspondence, there may be many 1-1 correspondences between them.

Our definition is somewhat awkward because it distinguishes between $X$
and $Y$ in the sense that the mapping goes from $X$ to $Y$. However, the definition
could be stated, equivalently, in terms of a mapping from $Y$ to $X$ because, as
seen in 2-5-4, there exists a mapping of $X \to Y$ which is surjective and injective
if and only if there exists a mapping of $Y \to X$ which is surjective and injective.
The important thing is that when the sets $X$ and $Y$ are in 1-1 correspondence
the mapping describing this correspondence may be taken in either direction.
We have settled the meaning of “sameness” for sets, and now turn our
attention to the meaning of sameness for two rings $\{R, +, \cdot\}$ and $\{R', +, \cdot\}$.
As indicated earlier this requires, simultaneously, the sameness of the two sets
$R$ and $R'$, the sameness of the two operations $+$, and the sameness of the two
operations $\cdot$. (There is no great danger in using the same symbols $+$ and $\cdot$ for
the operations in both rings.) Thus, we must have a 1-1 correspondence
between the sets $R$ and $R'$ under which the operations in $R$ and $R'$ correspond
to each other. In more detail, suppose we denote the 1-1 correspondence by

$$R \leftrightarrow R'$$

and write the correspondence between elements as

$$a \leftrightarrow a', \qquad a \in R,\ a' \in R',$$
$$b \leftrightarrow b', \qquad b \in R,\ b' \in R'.$$

Then correspondence of the operations in $R$ and $R'$ means that the result of
adding (or multiplying) two elements of $R$ corresponds to the result of adding
(or multiplying) the two corresponding elements of $R'$. Expressing this in
symbols, we have

$$a + b \leftrightarrow a' + b' \quad \text{and} \quad ab \leftrightarrow a'b'. \qquad (\#)$$

Now, according to our notation, given an element of $R$, the unique element
corresponding to it is denoted by adjoining a prime ($'$). In particular, we have

$$a + b \leftrightarrow (a + b)' \quad \text{and} \quad ab \leftrightarrow (ab)'.$$

Consequently, because the correspondence is 1-1, (#) tells us that

$$(a + b)' = a' + b' \quad \text{and} \quad (ab)' = a'b'$$

for all $a, b \in R$. These last two relations express the sameness of the operations
in $R$ and $R'$.

From the point of view of algebra, it is more important to focus on the
sameness, or preservation, of the operations than on the 1-1 correspondence
between sets. This is the motif underlying the following definition.
2-5-6. Definition. Let $\{R, +, \cdot\}$ and $\{S, +, \cdot\}$ be rings. A mapping
$\phi: R \to S$ is said to be a homomorphism of $R$ into $S$ when

$$\phi(a + b) = \phi(a) + \phi(b),$$

and

$$\phi(ab) = \phi(a)\phi(b),$$

for all $a, b \in R$. If, in addition, the mapping is both injective and surjective
(that is, if it is 1-1 and onto), then $\phi$ is said to be an isomorphism of $R$ onto $S$.

Thus, a homomorphism $\phi: R \to S$ is a mapping under which the ring
operations are preserved. A homomorphism provides us with a tool for
comparing two rings. Even more, we observe that the notion of an isomorphism
$\phi$ of $R$ onto $S$ is the precise mathematical formulation of our vague notion of
“sameness” for rings. In fact, when $\phi$ is an isomorphism, suppose we denote
$S$ by $R'$ and for any $a \in R$ write $\phi(a) = a'$; then $a \leftrightarrow a'$ is a 1-1 correspondence
between $R$ and $R'$ which expresses our intuitive notion of sameness for the
rings $R$ and $R'$.


2-5-7. Examples. (1) Consider the mapping $\phi: \mathbf{Z} \to \mathbf{Z}$ defined by

$$\phi(n) = 2n, \quad n \in \mathbf{Z}.$$

In 2-5-2, part (3) we observed that $\phi$ is injective but not surjective. Is $\phi$ a
homomorphism? For any $m, n \in \mathbf{Z}$ we have

$$\phi(m + n) = 2(m + n) = 2m + 2n = \phi(m) + \phi(n)$$

which says that $\phi$ preserves addition. On the other hand, for any $m, n \in \mathbf{Z}$

$$\phi(mn) = 2mn \quad\text{and}\quad \phi(m)\phi(n) = (2m)(2n) = 4mn.$$

Since these are not always equal, $\phi$ does not preserve multiplication; therefore,
$\phi$ is not a homomorphism.

We note in passing: If we modify the situation slightly and consider the
map $\phi: \mathbf{Z} \to 2\mathbf{Z}$ defined by $\phi(n) = 2n$, then $\phi$ is injective and surjective but not
a homomorphism.

In the same vein, let the mapping $\phi: \mathbf{Q} \to \mathbf{Q}$ be given by

$$\phi(a) = 2a, \quad a \in \mathbf{Q}$$

or what is the same, since $a$ can be written as $m/n$ where $m, n \in \mathbf{Z}$ and
$n \neq 0$, $\phi(m/n) = 2m/n$. Then $\phi$ is surjective, because given any $m/n \in \mathbf{Q}$ we
have $\phi(m/2n) = m/n$; and $\phi$ is injective, because $m/n \neq m'/n'$ implies
$\phi(m/n) \neq \phi(m'/n')$ since $2m/n \neq 2m'/n'$. In addition, $\phi$ is not a homomorphism; one
checks easily that $\phi$ preserves addition, but not multiplication.

(2) Consider the mapping $\phi: \mathbf{Z} \to \mathbf{Z}$ given by

$$\phi(a) = a^2, \quad a \in \mathbf{Z}.$$

As noted in 2-5-2, part (4) $\phi$ is neither surjective nor injective. Is it a
homomorphism? Multiplication is preserved since for $a, b \in \mathbf{Z}$

$$\phi(ab) = (ab)^2 = a^2 b^2 = \phi(a)\phi(b)$$

but addition is not preserved, since

$$\phi(a + b) = a^2 + 2ab + b^2 \quad\text{and}\quad \phi(a) + \phi(b) = a^2 + b^2;$$

hence $\phi$ is not a homomorphism.
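
The two checks in Examples (1) and (2) can be tried numerically. The following is a minimal Python sketch, not part of the text: the helper `preserves` and the sample window of integers are assumptions made for illustration. A finite sample can refute preservation of an operation (one failing pair suffices), but it can never prove it; the proofs above remain the actual argument.

```python
from itertools import product

def preserves(phi, op, sample):
    """True if phi(op(a, b)) == op(phi(a), phi(b)) for every pair in the sample."""
    return all(phi(op(a, b)) == op(phi(a), phi(b))
               for a, b in product(sample, repeat=2))

def add(a, b):
    return a + b

def mul(a, b):
    return a * b

# A finite window of Z; hypothetical choice, any window containing 1 would do.
sample = range(-5, 6)

def double(n):      # the map of Example (1): n -> 2n
    return 2 * n

def square(a):      # the map of Example (2): a -> a^2
    return a * a

double_add = preserves(double, add, sample)   # addition survives doubling
double_mul = preserves(double, mul, sample)   # fails, e.g. at m = n = 1
square_add = preserves(square, add, sample)   # fails, e.g. at a = b = 1
square_mul = preserves(square, mul, sample)   # multiplication survives squaring
```

On this sample, doubling passes the additive test and fails the multiplicative one, while squaring does the reverse, in agreement with the computations above.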

(3) Suppose $R$ and $S$ are any rings. We may always define a mapping
$\phi: R \to S$ by

$$\phi(a) = 0 \quad\text{for all } a \in R.$$

Clearly, $\phi$ is a homomorphism. It is known as the trivial homomorphism (or as
the zero map) and is not of much interest; no information about the rings $R$
and $S$ can be gleaned from the trivial homomorphism.

(4) Let us fix an integer $c \neq 0$ and define the mapping $\phi: \mathbf{Z} \to \mathbf{Z}$ by

$$\phi(x) = x + c, \quad x \in \mathbf{Z}.$$

In particular, $\phi(0) = c$, $\phi(5) = 5 + c$, $\phi(-9) = -9 + c$; the action of $\phi$
amounts to adding $c$ to any integer. Now, $\phi$ is injective because for $x, y \in \mathbf{Z}$

$$\phi(x) = \phi(y) \Rightarrow x + c = y + c \Rightarrow x = y$$

and $\phi$ is surjective because given any $y \in \mathbf{Z}$ we have

$$\phi(y - c) = y.$$

In addition, $\phi(x + y) = x + y + c$ and $\phi(x) + \phi(y) = x + y + 2c$ so (because
$c \neq 0$) $\phi$ does not preserve addition; and $\phi$ does not preserve multiplication
since $\phi(xy) = xy + c$ and $\phi(x)\phi(y) = xy + cx + cy + c^2$.

(5) Fix an integer $m > 1$; then we can always define a natural mapping
$\phi: \mathbf{Z} \to \mathbf{Z}_m$ by

$$\phi(a) = \lfloor a \rfloor_m, \quad a \in \mathbf{Z}.$$

In other words, $\phi$ maps each integer to its residue class modulo $m$, and it is
often referred to as the residue class map mod $m$.

In the first place, $\phi$ is a homomorphism, because for any $a, b \in \mathbf{Z}$ we have

$$\phi(a + b) = \lfloor a + b \rfloor_m = \lfloor a \rfloor_m + \lfloor b \rfloor_m = \phi(a) + \phi(b),$$
$$\phi(ab) = \lfloor ab \rfloor_m = \lfloor a \rfloor_m \lfloor b \rfloor_m = \phi(a)\phi(b).$$

Thus, $\phi$ is a homomorphism precisely because of the way in which addition
and multiplication are defined in $\mathbf{Z}_m$.

Moreover, $\phi$ is surjective; in fact, given any $\alpha \in \mathbf{Z}_m$ we may write it in
the form $\alpha = \lfloor a \rfloor_m$ with $a \in \mathbf{Z}$, and then $\phi(a) = \alpha$. However, $\phi$ is not injective;
in fact, if $a' \equiv a \pmod{m}$ (that is, if $a'$ is an element of the residue class $\lfloor a \rfloor_m$),
then $\phi(a') = \phi(a)$ because $\lfloor a' \rfloor_m = \lfloor a \rfloor_m$.
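
As an informal check of Example (5), the residue class map can be modeled in Python by representing each class $\lfloor a \rfloor_m$ by its remainder in $\{0, 1, \ldots, m-1\}$. This is only a sketch under that assumption; the function name `residue`, the modulus 7, and the sample window are illustrative choices, not part of the text.

```python
def residue(a, m):
    """Representative in {0, ..., m-1} of the residue class of a modulo m."""
    return a % m   # for m > 0, Python's % always returns a value in 0..m-1

m = 7                      # hypothetical modulus chosen for the experiment
sample = range(-20, 21)    # finite window of Z

# Addition and multiplication in Z_m are computed on representatives, then reduced.
add_ok = all(residue(a + b, m) == (residue(a, m) + residue(b, m)) % m
             for a in sample for b in sample)
mul_ok = all(residue(a * b, m) == (residue(a, m) * residue(b, m)) % m
             for a in sample for b in sample)

# Surjective: every class is hit.  Not injective: a and a + m collide.
image = {residue(a, m) for a in sample}
collision = residue(3, m) == residue(3 + m, m)
```

The window is large enough that every class mod 7 appears in `image`, and the pair $3$, $3 + m$ exhibits the failure of injectivity described above.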

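(6) Let us fix integers $m > 1$, $n > 1$ and consider the rings $\mathbf{Z}_m$, $\mathbf{Z}_n$ and the
direct sum $\mathbf{Z}_m \oplus \mathbf{Z}_n$. As the reader will recall (see 2-2-11), the ring $\mathbf{Z}_m \oplus \mathbf{Z}_n$
consists of all ordered pairs $(\alpha, \beta)$ where $\alpha \in \mathbf{Z}_m$, $\beta \in \mathbf{Z}_n$, and the operations
are componentwise. We define a mapping $\phi: \mathbf{Z} \to \mathbf{Z}_m \oplus \mathbf{Z}_n$ by putting

$$\phi(a) = (\lfloor a \rfloor_m, \lfloor a \rfloor_n), \quad a \in \mathbf{Z}.$$

In particular, when $m = 5$, $n = 6$ the action of $\phi$ is of form $\phi(a) = (\lfloor a \rfloor_5, \lfloor a \rfloor_6)$;
so, for example,

$$\phi(2) = (\lfloor 2 \rfloor_5, \lfloor 2 \rfloor_6),$$
$$\phi(5) = (\lfloor 5 \rfloor_5, \lfloor 5 \rfloor_6) = (\lfloor 0 \rfloor_5, \lfloor 5 \rfloor_6),$$
$$\phi(32) = (\lfloor 32 \rfloor_5, \lfloor 32 \rfloor_6) = (\lfloor 2 \rfloor_5, \lfloor 2 \rfloor_6),$$
$$\phi(-10) = (\lfloor -10 \rfloor_5, \lfloor -10 \rfloor_6) = (\lfloor 0 \rfloor_5, \lfloor 2 \rfloor_6).$$

Because of the way the operations are defined in residue class rings and in
direct sums, the mapping $\phi: \mathbf{Z} \to \mathbf{Z}_m \oplus \mathbf{Z}_n$ is a homomorphism. In more
detail, for $a, b \in \mathbf{Z}$ we have

$$\begin{aligned}
\phi(a + b) &= (\lfloor a + b \rfloor_m, \lfloor a + b \rfloor_n) \\
&= (\lfloor a \rfloor_m + \lfloor b \rfloor_m, \lfloor a \rfloor_n + \lfloor b \rfloor_n) \\
&= (\lfloor a \rfloor_m, \lfloor a \rfloor_n) + (\lfloor b \rfloor_m, \lfloor b \rfloor_n) \\
&= \phi(a) + \phi(b),
\end{aligned}$$

and

$$\begin{aligned}
\phi(ab) &= (\lfloor ab \rfloor_m, \lfloor ab \rfloor_n) \\
&= (\lfloor a \rfloor_m \lfloor b \rfloor_m, \lfloor a \rfloor_n \lfloor b \rfloor_n) \\
&= (\lfloor a \rfloor_m, \lfloor a \rfloor_n)(\lfloor b \rfloor_m, \lfloor b \rfloor_n) \\
&= \phi(a)\phi(b).
\end{aligned}$$

The mapping $\phi$ is not injective; for example, $\phi(0) = (\lfloor 0 \rfloor_m, \lfloor 0 \rfloor_n) = (0, 0)$, the
zero element of $\mathbf{Z}_m \oplus \mathbf{Z}_n$, and also $\phi(mn) = (\lfloor mn \rfloor_m, \lfloor mn \rfloor_n) = (0, 0)$.

The question of whether $\phi$ is surjective is more complicated but extremely
interesting. It asks: Given any element $(\alpha, \beta) \in \mathbf{Z}_m \oplus \mathbf{Z}_n$ which we may, of
course, write in the form $(\lfloor a \rfloor_m, \lfloor b \rfloor_n)$, does there exist $c \in \mathbf{Z}$ for which
$\phi(c) = (\lfloor a \rfloor_m, \lfloor b \rfloor_n)$, that is, for which $(\lfloor c \rfloor_m, \lfloor c \rfloor_n) = (\lfloor a \rfloor_m, \lfloor b \rfloor_n)$? Thus, we seek
$c \in \mathbf{Z}$ for which both

$$\lfloor c \rfloor_m = \lfloor a \rfloor_m \quad\text{and}\quad \lfloor c \rfloor_n = \lfloor b \rfloor_n$$

or, translating to the language of congruences, we seek $c \in \mathbf{Z}$ such that

$$c \equiv a \pmod{m} \quad\text{and}\quad c \equiv b \pmod{n}.$$

Can this be done? Obviously, if $m = n$ and $a \not\equiv b \pmod{m}$, then no such $c$
exists, and $\phi$ is not surjective. However, in the general case, when $m$, $n$, $a$, $b$
are arbitrary, the tools required for settling the question of existence of $c$ are
not yet at our disposal. At this stage, therefore, we are unable to decide if, or
when, $\phi$ is surjective.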
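
Although the general question is left open here, individual cases can be settled by brute force. The sketch below is an illustration only: the function names and the chosen pairs $(m, n)$ are assumptions, and classes are again represented by remainders. It relies on the observation that $\phi(a)$ depends only on $a$ modulo $mn$ (if $mn$ divides $a - a'$ then so do $m$ and $n$), so one full period $0, 1, \ldots, mn - 1$ already produces the whole image.

```python
def image_of_phi(m, n):
    """Image of a -> (a mod m, a mod n); one period of length m*n covers it."""
    return {(a % m, a % n) for a in range(m * n)}

def is_surjective(m, n):
    """phi is onto Z_m (+) Z_n exactly when the image has all m*n pairs."""
    return len(image_of_phi(m, n)) == m * n

# Hypothetical test cases; (3, 3) is an instance of the m = n remark in the text.
results = {(m, n): is_surjective(m, n)
           for (m, n) in [(5, 6), (2, 3), (3, 3), (4, 6)]}
```

For these four cases the computation finds that $\phi$ is surjective for $(5, 6)$ and $(2, 3)$ but not for $(3, 3)$ or $(4, 6)$. Such experiments only decide particular pairs; they do not replace the general criterion that the text postpones.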

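(7) Fix integers $m_1 > 1$, $m_2 > 1$ and consider the direct sum ring
$\mathbf{Z}_{m_1} \oplus \mathbf{Z}_{m_2}$. We may define mappings

$$f_1: \mathbf{Z}_{m_1} \to \mathbf{Z}_{m_1} \oplus \mathbf{Z}_{m_2} \quad\text{and}\quad f_2: \mathbf{Z}_{m_2} \to \mathbf{Z}_{m_1} \oplus \mathbf{Z}_{m_2}$$

by putting for $\alpha_1 \in \mathbf{Z}_{m_1}$, $\alpha_2 \in \mathbf{Z}_{m_2}$

$$f_1(\alpha_1) = (\alpha_1, 0) \quad\text{and}\quad f_2(\alpha_2) = (0, \alpha_2)$$

or, writing this another way,

$$f_1(\lfloor a_1 \rfloor_{m_1}) = (\lfloor a_1 \rfloor_{m_1}, 0) \quad\text{and}\quad f_2(\lfloor a_2 \rfloor_{m_2}) = (0, \lfloor a_2 \rfloor_{m_2}).$$

Each of these maps is clearly an injective homomorphism, but not surjective.
In addition, we may define mappings

$$g_1: \mathbf{Z}_{m_1} \oplus \mathbf{Z}_{m_2} \to \mathbf{Z}_{m_1} \quad\text{and}\quad g_2: \mathbf{Z}_{m_1} \oplus \mathbf{Z}_{m_2} \to \mathbf{Z}_{m_2}$$

by putting

$$g_1(\alpha_1, \alpha_2) = \alpha_1 \quad\text{and}\quad g_2(\alpha_1, \alpha_2) = \alpha_2$$

or, what is the same thing,

$$g_1(\lfloor a_1 \rfloor_{m_1}, \lfloor a_2 \rfloor_{m_2}) = \lfloor a_1 \rfloor_{m_1} \quad\text{and}\quad g_2(\lfloor a_1 \rfloor_{m_1}, \lfloor a_2 \rfloor_{m_2}) = \lfloor a_2 \rfloor_{m_2}.$$

Each of these maps is clearly a surjective homomorphism, but not injective.

For $i = 1, 2$ one often refers to $f_i$ as the injection on the $i$th coordinate and
$g_i$ as the projection on the $i$th coordinate.

Naturally, the notions of injection and projection may be extended to the
case of a direct sum $\mathbf{Z}_{m_1} \oplus \mathbf{Z}_{m_2} \oplus \cdots \oplus \mathbf{Z}_{m_n}$ where $n > 2$. Even more, if
$R_1, R_2, \ldots, R_n$ are any rings, we may associate with the direct sum
$R_1 \oplus R_2 \oplus \cdots \oplus R_n$ injective maps $f_1, f_2, \ldots, f_n$ and projection maps $g_1, g_2, \ldots, g_n$.
The details are left to the reader.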

(8) Consider the domain of complex numbers $\mathbf{C}$. A generic complex
number is of form $z = x + yi$ where $x$ and $y$ are real numbers. As is well
known, with each complex number $z = x + yi$ we may associate its conjugate
$\bar{z} = x - yi$. Among the standard properties of conjugation (which the reader
may easily verify if he is not familiar with them already) we have, for all
$z, z_1, z_2$ in $\mathbf{C}$:

(i) $z\bar{z} = x^2 + y^2$,

(ii) $\bar{\bar{z}} = z$,

(iii) $\overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2$,

(iv) $\overline{z_1 z_2} = \bar{z}_1 \cdot \bar{z}_2$.

In words, the last two properties say that the conjugate of a sum is the sum of
the conjugates and that the conjugate of a product is the product of the
conjugates, respectively.

Now, consider the mapping $\phi: \mathbf{C} \to \mathbf{C}$ defined by

$$\phi(z) = \bar{z}, \quad z \in \mathbf{C}.$$

Naturally, $\phi$ is known as the “conjugation map.” Let us verify that $\phi$ is an
isomorphism. It is surjective because, given any $z \in \mathbf{C}$ we have

$$\phi(\bar{z}) = \bar{\bar{z}} = z,$$

so $z$ is the image of $\bar{z}$ under $\phi$. It is injective because

$$\begin{aligned}
\phi(z_1) = \phi(z_2) &\Rightarrow \bar{z}_1 = \bar{z}_2 \\
&\Rightarrow \bar{\bar{z}}_1 = \bar{\bar{z}}_2 \\
&\Rightarrow z_1 = z_2.
\end{aligned}$$

Finally, $\phi$ is a homomorphism because for any $z_1, z_2 \in \mathbf{C}$

$$\phi(z_1 + z_2) = \overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2 = \phi(z_1) + \phi(z_2),$$
$$\phi(z_1 z_2) = \overline{z_1 z_2} = \bar{z}_1 \cdot \bar{z}_2 = \phi(z_1) \cdot \phi(z_2).$$

(Thus the statement “$\phi$ is a homomorphism” is essentially another way of
expressing the fact that the conjugate of a sum, or product, is the sum, or
product, of the conjugates.)

For an arbitrary ring $R$, any isomorphism $\phi: R \to R$ (that is, any isomorphism
of $R$ with itself) is said to be an automorphism of $R$. Thus, the foregoing
discussion can be summarized by the statement: conjugation is an
automorphism of $\mathbf{C}$.
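
Properties (ii), (iii), and (iv) of conjugation can be spot-checked with Python's built-in `complex` type. This is a sketch only: the grid of test points is an assumption, restricted to small integer real and imaginary parts so that the floating-point arithmetic involved is exact.

```python
from itertools import product

# Gaussian-integer grid; small integer parts keep float arithmetic exact.
pts = [complex(x, y) for x, y in product(range(-3, 4), repeat=2)]

def conj(z):
    """The conjugation map z -> z-bar."""
    return z.conjugate()

add_ok = all(conj(z + w) == conj(z) + conj(w) for z in pts for w in pts)
mul_ok = all(conj(z * w) == conj(z) * conj(w) for z in pts for w in pts)
involution_ok = all(conj(conj(z)) == z for z in pts)   # property (ii)
```

Property (ii) is what makes conjugation its own inverse, which is the fact used above to show that the map is both injective and surjective.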

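(9) Let $R = \mathcal{M}(\mathbf{R}, 2)$, the ring of all $2 \times 2$ matrices with entries from $\mathbf{R}$.
Let $S$ denote the subset of $R$ consisting of all elements of form

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \quad a, b \in \mathbf{R}.$$

It is easy to see that the nonempty set $S$ is a subring of $R$.
In fact, taking $\begin{pmatrix} c & -d \\ d & c \end{pmatrix}$, $c, d \in \mathbf{R}$, which is also an element of $S$, it suffices,
according to 2-2-7, to observe that

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} - \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} a - c & -(b - d) \\ b - d & a - c \end{pmatrix}$$

and

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix}.$$

[This is identical with what was done in 2-2-10 for the subset of $\mathcal{M}(\mathbf{Z}, 2)$
denoted there by $T$.]

Now, let us define the mapping $\phi: \mathbf{C} \to S$ by

$$\phi(a + bi) = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}, \quad a, b \in \mathbf{R}.$$

The map $\phi$ is injective (since distinct complex numbers obviously map under
$\phi$ to distinct matrices) and surjective (since from any given matrix belonging
to $S$ we can immediately produce the complex number whose image it is
under $\phi$). Moreover, $\phi$ is a homomorphism because

$$\phi((a + bi) + (c + di)) = \phi((a + c) + (b + d)i) = \begin{pmatrix} a + c & -(b + d) \\ b + d & a + c \end{pmatrix}$$

and

$$\phi(a + bi) + \phi(c + di) = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} + \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} a + c & -(b + d) \\ b + d & a + c \end{pmatrix}$$

while

$$\phi((a + bi)(c + di)) = \phi((ac - bd) + (ad + bc)i) = \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix}$$

and

$$\phi(a + bi)\phi(c + di) = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix}.$$

Consequently, $\phi$ is an isomorphism of $\mathbf{C}$ onto $S$, and $S$ is just a disguised
version of the complex numbers.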
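
The matrix identities of Example (9) can be checked mechanically. The sketch below is illustrative, with assumptions named plainly: matrices are modeled as nested tuples, the helpers `mat_add` and `mat_mul` are hand-written for the $2 \times 2$ case, and the test values are small integers rather than arbitrary reals.

```python
def phi(a, b):
    """Matrix attached to a + bi, stored as ((a, -b), (b, a))."""
    return ((a, -b), (b, a))

def mat_add(M, N):
    """Entrywise sum of two 2 x 2 matrices."""
    return tuple(tuple(M[i][j] + N[i][j] for j in range(2)) for i in range(2))

def mat_mul(M, N):
    """Usual row-by-column product of two 2 x 2 matrices."""
    return tuple(tuple(sum(M[i][k] * N[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

vals = range(-3, 4)   # hypothetical integer test values for a, b, c, d
quads = [(a, b, c, d) for a in vals for b in vals for c in vals for d in vals]

# (a + bi) + (c + di) = (a + c) + (b + d)i
add_ok = all(phi(a + c, b + d) == mat_add(phi(a, b), phi(c, d))
             for a, b, c, d in quads)
# (a + bi)(c + di) = (ac - bd) + (ad + bc)i
mul_ok = all(phi(a * c - b * d, a * d + b * c) == mat_mul(phi(a, b), phi(c, d))
             for a, b, c, d in quads)

i_squared = mat_mul(phi(0, 1), phi(0, 1))   # the matrix of i, squared
```

In particular, the matrix attached to $i$ squares to the matrix attached to $-1$, so the relation $i^2 = -1$ is visible inside $S$.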

If we have an isomorphism between two rings, then they are essentially the 
same and, as indicated earlier, if we have a homomorphism of one ring into 
another, then their respective operations are related. Our next results describe 
how two rings are related under a homomorphism. 


2-5-8. Proposition. Suppose $R$ and $S$ are rings and $\phi: R \to S$ is a
homomorphism; then

(1) $\phi(0) = 0$,

(2) $\phi(-a) = -(\phi(a))$, $a \in R$,

(3) $\phi(a - b) = \phi(a) - \phi(b)$, $a, b \in R$,

(4) if $R$ has an identity $e$ and $\phi$ is surjective, then $S$ has an identity,
namely, $\phi(e)$.

Proof: (1) It should be noted that the $0$ in $\phi(0)$ is the zero element of $R$
and the $0$ on the right side is the zero element of $S$; there is no serious danger
of confusion. As for the proof, we have

$$\phi(0) = \phi(0 + 0) = \phi(0) + \phi(0),$$

which implies that $\phi(0)$ is the zero element of $S$.

(2) From the relations

$$0 = \phi(0) = \phi(a + (-a)) = \phi(a) + \phi(-a),$$

it follows that $\phi(-a)$ is the additive inverse of $\phi(a)$ in the ring $S$; in other
words, $\phi(-a) = -(\phi(a))$. Of course, the “minus” sign has two meanings;
in $\phi(-a)$ it denotes the additive inverse in $R$, while in $-(\phi(a))$ it denotes the
additive inverse in $S$.

(3) For any $a, b \in R$ we have

$$\phi(a - b) = \phi(a + (-b)) = \phi(a) + \phi(-b) = \phi(a) - \phi(b).$$

(4) By hypothesis, $ea = a = ae$ for all $a \in R$. Now consider any element
$s \in S$. We need to show that $\phi(e)s = s = s\phi(e)$. Because $\phi$ is surjective there
exists an element $c \in R$ for which $\phi(c) = s$. Since $ec = c = ce$, application of
$\phi$ gives $\phi(ec) = \phi(c) = \phi(ce)$ and

$$\phi(e)\phi(c) = \phi(c) = \phi(c)\phi(e),$$

which says that $\phi(e)s = s = s\phi(e)$. ∎

2-5-9. Proposition. Suppose $\phi: R \to S$ is a homomorphism of rings. Then:

(1) The image of $\phi$, $\phi(R)$, is a subring of $S$.

(2) If we define the kernel of $\phi$ (and denote it as “$\ker \phi$”) by

$$\ker \phi = \{a \in R \mid \phi(a) = 0\},$$

then $\ker \phi$ is a subring of $R$.

(3) The map $\phi$ is injective $\Leftrightarrow$ $\ker \phi = (0)$.

(4) If $\phi$ is injective, then $\phi$ is an isomorphism of $R$ onto $\phi(R)$.

Proof: (1) Consider any two elements of $\phi(R)$; they are of form

$$s_1 = \phi(a), \quad s_2 = \phi(b), \quad a, b \in R, \quad s_1, s_2 \in S.$$

Since $s_1 - s_2 = \phi(a) - \phi(b) = \phi(a - b) \in \phi(R)$ and $s_1 s_2 = \phi(a)\phi(b) = \phi(ab) \in \phi(R)$,
we see that $\phi(R)$ is closed under subtraction and multiplication and,
therefore, $\phi(R)$ is a subring of $S$.

(2) Suppose $a$ and $b$ are any elements of $\ker \phi$, so according to the definition
of kernel we have $\phi(a) = 0$ and $\phi(b) = 0$. To prove that $\ker \phi$ is a subring,
it suffices to show that $a - b \in \ker \phi$ and $ab \in \ker \phi$; and the way to accomplish
this is to apply $\phi$ to $a - b$ and $ab$. We have then

$$\phi(a - b) = \phi(a) - \phi(b) = 0,$$
$$\phi(ab) = \phi(a)\phi(b) = 0,$$

so $a - b$ and $ab$ are in $\ker \phi$, and $\ker \phi$ is a subring of $R$.

Actually a stronger statement holds about the kernel of a homomorphism,
but we will not investigate it at this point.

(3) The element $0$ of $R$ always belongs to the kernel of $\phi$, since $\phi(0) = 0$.
Now, suppose $\ker \phi = (0)$, meaning that $\ker \phi$ consists of the zero element
alone. Then for $a, b \in R$, we have

$$\begin{aligned}
\phi(a) = \phi(b) &\Rightarrow \phi(a - b) = \phi(a) - \phi(b) = 0 \\
&\Rightarrow a - b \in \ker \phi \\
&\Rightarrow a - b = 0 \\
&\Rightarrow a = b,
\end{aligned}$$

so $\phi$ is 1-1. This proves the implication $\Leftarrow$.

To prove the implication $\Rightarrow$, suppose conversely that $\phi$ is 1-1. If $a \in \ker \phi$,
then $\phi(a) = 0 = \phi(0)$, and by one-to-oneness $a = 0$. Thus, $0$ is the sole element
of the kernel or, what is the same thing, $\ker \phi = (0)$.

This result is quite useful. In order to decide if a homomorphism $\phi$ is
injective, it suffices to determine which elements map to $0$
under $\phi$; in particular, if $\phi(a) = 0$ implies $a = 0$, then $\ker \phi = (0)$ and the
homomorphism is injective.

(4) According to part (1), $\phi(R)$ is a subring of $S$. Clearly, $\phi: R \to \phi(R)$ is
still an injective homomorphism. But it is also surjective, so $\phi$ is an isomorphism
of $R$ onto $\phi(R)$. ∎
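
The kernel criterion of part (3) is easy to watch in action on finite rings. The following sketch is an illustration under stated assumptions: the two maps are examples chosen here, not taken from the text, and residue classes are represented by remainders. The first map reduces $\mathbf{Z}_{12}$ modulo 4; the second sends $\lfloor a \rfloor_3$ to $\lfloor 4a \rfloor_6$, which is well defined because $4 \cdot 3 \equiv 0 \pmod 6$ and multiplicative because $4 \cdot 4 \equiv 4 \pmod 6$.

```python
def kernel(phi, elements, zero=0):
    """All elements of the (finite) domain that phi sends to zero."""
    return {a for a in elements if phi(a) == zero}

def is_injective(phi, elements):
    """Direct test: no two domain elements share an image."""
    images = [phi(a) for a in elements]
    return len(set(images)) == len(images)

def is_hom(phi, elements, m_src, m_dst):
    """Check both ring operations between Z_{m_src} and Z_{m_dst}."""
    return all(phi((a + b) % m_src) == (phi(a) + phi(b)) % m_dst and
               phi((a * b) % m_src) == (phi(a) * phi(b)) % m_dst
               for a in elements for b in elements)

Z12, Z3 = range(12), range(3)

def reduce_mod4(a):          # hypothetical example: Z_12 -> Z_4
    return a % 4

def times4_mod6(a):          # hypothetical example: Z_3 -> Z_6
    return (4 * a) % 6

ker_reduce = kernel(reduce_mod4, Z12)    # expected to contain more than 0
ker_times4 = kernel(times4_mod6, Z3)     # expected to be {0}
```

Both maps pass `is_hom`. The first has kernel $\{0, 4, 8\}$ and is not injective; the second has kernel $\{0\}$ and is injective, so by part (4) it carries $\mathbf{Z}_3$ isomorphically onto its image $\{0, 4, 2\}$ inside $\mathbf{Z}_6$.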

This last result is often convenient, for example, in connection with
Example (9) of 2-5-7. More precisely, consider the mapping $\phi: \mathbf{C} \to \mathcal{M}(\mathbf{R}, 2)$
defined by

$$\phi(a + bi) = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}.$$

As before, one verifies that $\phi$ is a homomorphism. Since $\begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ is clearly the
zero element of $\mathcal{M}(\mathbf{R}, 2)$, it is immediate that $\ker \phi = (0)$, so $\phi$ is injective.
Therefore, according to our last result, $\phi$ is an isomorphism of $\mathbf{C}$ onto $\phi(\mathbf{C})$.
Obviously, $\phi(\mathbf{C})$ is precisely the set $S$ defined in 2-5-7, part (9). In particular,
there is no need to provide a separate proof that $S$ is a subring of $\mathcal{M}(\mathbf{R}, 2)$, as
was done there; this fact is automatic in virtue of part (1) of 2-5-9.


2-5-10. Proposition. Suppose $R$, $S$, $T$ are rings. Suppose we are given
mappings $\phi: R \to S$, $\psi: S \to T$ and consider the composite mapping
$\psi \circ \phi: R \to T$ defined, as usual, by

$$(\psi \circ \phi)(a) = \psi(\phi(a)), \quad a \in R.$$

Then:

(i) If both $\phi$ and $\psi$ are 1-1, then so is $\psi \circ \phi$.

(ii) If both $\phi$ and $\psi$ are onto, then so is $\psi \circ \phi$.

(iii) If both $\phi$ and $\psi$ are homomorphisms, then so is $\psi \circ \phi$.

(iv) If both $\phi$ and $\psi$ are isomorphisms, then so is $\psi \circ \phi$.

(v) If $\phi: R \to S$ is an isomorphism, so is $\phi^{-1}: S \to R$.

Proof: (i) and (ii) were proved in 2-5-3.

(iii) We must show that $\psi \circ \phi \in \mathrm{Map}(R, T)$ preserves both addition and
multiplication. These properties do indeed hold because, for any $a, b \in R$

$$\begin{aligned}
(\psi \circ \phi)(a + b) &= \psi[\phi(a + b)] \\
&= \psi[\phi(a) + \phi(b)] \\
&= \psi[\phi(a)] + \psi[\phi(b)] \\
&= (\psi \circ \phi)(a) + (\psi \circ \phi)(b)
\end{aligned}$$

and

$$\begin{aligned}
(\psi \circ \phi)(ab) &= \psi[\phi(ab)] \\
&= \psi[\phi(a)\phi(b)] \\
&= \psi[\phi(a)] \cdot \psi[\phi(b)] \\
&= [(\psi \circ \phi)(a)][(\psi \circ \phi)(b)].
\end{aligned}$$

(iv) This is immediate from (i), (ii), and (iii).

(v) The basic facts about $\phi^{-1}$ were discussed in 2-5-4. Let us recall some
of them explicitly. Given $s \in S$, because $\phi$ is surjective there exists $a \in R$ for
which $\phi(a) = s$, and this element $a$ is unique because $\phi$ is injective. We have
then, by definition, $\phi^{-1}(s) = a$. Furthermore, as proved in 2-5-4, $\phi^{-1} \circ \phi: R \to R$
is the identity map, $\phi \circ \phi^{-1}: S \to S$ is the identity map, and $\phi^{-1}: S \to R$
is both surjective and injective.

To prove $\phi^{-1}$ is an isomorphism, it suffices to show that it is a homomorphism.
For this, suppose $s_1, s_2 \in S$. We may write $\phi^{-1}(s_1) = a_1 \in R$,
$\phi^{-1}(s_2) = a_2 \in R$ or, what amounts to the same thing, $\phi(a_1) = s_1$, $\phi(a_2) = s_2$.
Then

$$\begin{aligned}
\phi^{-1}(s_1 + s_2) &= \phi^{-1}(\phi(a_1) + \phi(a_2)) && \text{(substitution)} \\
&= \phi^{-1}(\phi(a_1 + a_2)) && \text{($\phi$ is a homomorphism)} \\
&= (\phi^{-1} \circ \phi)(a_1 + a_2) && \text{(definition of composite)} \\
&= a_1 + a_2 && \text{($\phi^{-1} \circ \phi$ = identity)} \\
&= \phi^{-1}(s_1) + \phi^{-1}(s_2) && \text{(substitution),}
\end{aligned}$$

$$\begin{aligned}
\phi^{-1}(s_1 s_2) &= \phi^{-1}[\phi(a_1)\phi(a_2)] \\
&= \phi^{-1}[\phi(a_1 a_2)] \\
&= (\phi^{-1} \circ \phi)(a_1 a_2) \\
&= a_1 a_2 \\
&= \phi^{-1}(s_1)\phi^{-1}(s_2).
\end{aligned}$$

This completes the proof. ∎


In virtue of the last result, there exists an isomorphism of R onto S if and 
only if there exists an isomorphism of S onto R ; and in such a situation there 
is nothing lost in saying that the rings R and S are isomorphic. 

We now have more than enough tools to prove a theorem which may be 
considered to be the focus or culmination of almost the entire development in 
this chapter. 


2-5-11. Theorem. Let $\{D, P\}$ be a well-ordered domain with identity
$e$; then $\mathbf{Z}$ is order-isomorphic to $D$. More precisely, we can exhibit
an isomorphism of $\mathbf{Z}$ onto $D$ which preserves order.


Proof: Of course, both $\mathbf{Z}$ and $D$ are rings, and we shall first produce an
isomorphism $\phi$ of $\mathbf{Z}$ onto $D$. Then we shall prove that $\phi$ preserves order (after
explaining the meaning of the phrases “order-isomorphism” and “preserves
order”).

Let us define a map $\phi: \mathbf{Z} \to D$ by

$$\phi(n) = ne, \quad n \in \mathbf{Z}.$$

The meaning or definition of $ne$ in $D$, and the properties of such symbols, were
discussed in 2-4-6 and 2-4-7. Using these properties, we see that for $m, n \in \mathbf{Z}$

$$\phi(m + n) = (m + n)e = me + ne = \phi(m) + \phi(n),$$
$$\phi(mn) = (mn)e = (me)(ne) = \phi(m)\phi(n).$$

Consequently, $\phi$ is a homomorphism.

Since $\{D, P\}$ is an ordered domain, we know from 2-3-6 that $e > 0$.
Therefore, $2e = e + e > 0$, $3e = 2e + e > 0$, and inductively $me > 0$ for all
integers $m > 0$. To put it another way, for every integer $m > 0$ we have
$\phi(m) > 0$ in $\{D, P\}$. Consider the set

$$S = \{me \mid m > 0\}$$

which may also be written as

$$S = \{\phi(m) \mid m > 0\}.$$

In words, $S$ is the image under $\phi$ of the set of all positive integers. Clearly,
$S \subset P$ and $e \in S$. Moreover, the set $S$ is inductive, since $me \in S$ implies
$me + e = (m + 1)e \in S$. Consequently, because $\{D, P\}$ is well ordered, 2-3-14
tells us that $S = P$.

Since $P = \{me \mid m > 0\}$, it now follows that

$$-P = \{-(me) \mid m > 0\} = \{(-m)e \mid m > 0\} = \{me \mid m < 0\} = \{\phi(m) \mid m < 0\}.$$

If we let $\mathbf{Z}_{>0}$ denote the set of all positive integers and $\mathbf{Z}_{<0}$ denote the set of
all negative integers, the preceding remarks yield

$$\begin{aligned}
\phi(\mathbf{Z}_{>0}) &= \{\phi(m) \mid m > 0\} = P, \\
\phi(\mathbf{Z}_{<0}) &= \{\phi(m) \mid m < 0\} = -P.
\end{aligned} \qquad (*)$$

In particular, for the homomorphism $\phi: \mathbf{Z} \to D$ we have

$$\begin{aligned}
m > 0 &\Rightarrow \phi(m) > 0, \\
m = 0 &\Rightarrow \phi(m) = 0, \\
m < 0 &\Rightarrow \phi(m) < 0.
\end{aligned} \qquad (**)$$

In words, (**) asserts that under $\phi$, the image of a positive integer is a positive
element of $D$, the image of zero is zero, and the image of a negative integer is a
negative element of $D$; even more, (*) asserts that under $\phi$, the image of the
set of all positive integers is the set of all positive elements of $D$, and the image
of the set of all negative integers is the set of all negative elements of $D$.

With all the machinery in place, it is easy to show that $\phi$ is both injective
and surjective. In fact, if $m \in \ker \phi$, then $\phi(m) = 0$, so according to (*) we
have $m = 0$; hence, $\ker \phi = (0)$ and $\phi$ is injective. Furthermore, according to
the definition of ordered domain, $D$ is a disjoint union $D = P \cup \{0\} \cup -P$;
so in virtue of (*) we have

$$D = \phi(\mathbf{Z}_{>0}) \cup \{\phi(0)\} \cup \phi(\mathbf{Z}_{<0})$$

which guarantees that every element of $D$ is the image of some integer under
$\phi$; that is, $\phi$ is surjective. [The proof of surjectivity of $\phi$ could have been
expressed more directly in the form: $\mathbf{Z} = \mathbf{Z}_{>0} \cup \{0\} \cup \mathbf{Z}_{<0}$ implies

$$\phi(\mathbf{Z}) = \phi(\mathbf{Z}_{>0}) \cup \{\phi(0)\} \cup \phi(\mathbf{Z}_{<0}) = P \cup \{0\} \cup -P = D.]$$

Therefore, $\phi: \mathbf{Z} \to D$ is an isomorphism.

Thus, $\mathbf{Z}$ and $D$ are essentially the same, as rings. But apart from their ring
structures both $\mathbf{Z}$ and $D$ are well-ordered domains; so on each of them we
have an order relation defined in terms of either the set of positive elements or
the relation $<$ (see Section 2-3). In order for $\mathbf{Z}$ and $D$ to be essentially the
same, as well-ordered domains, it is natural to require that the isomorphism
$\phi$ (or some other isomorphism between $\mathbf{Z}$ and $D$) should also preserve order.
The “preservation of order” may be defined in terms of the set of positive
elements, but it is more convenient and intuitive to do so in terms of the
relation $<$. More precisely, an arbitrary mapping $\phi: \mathbf{Z} \to D$ is said to preserve
order when, for any integers $m$ and $n$

$$m < n \Rightarrow \phi(m) < \phi(n);$$

in other words, $m$ is less than $n$ (in $\mathbf{Z}$) implies that $\phi(m)$ is less than $\phi(n)$ (in
$D$). Then, by an order-isomorphism we mean an isomorphism which preserves
order. Clearly, the notion of order-isomorphism is applicable for any two
ordered, or well-ordered, domains, and two such domains should be considered
to be the same if they are order-isomorphic.

Returning to our isomorphism $\phi: n \to ne$ of $\mathbf{Z}$ onto $D$, it is immediate
from (**) that $\phi$ preserves order; in fact,

$$\begin{aligned}
m < n &\Rightarrow m - n < 0 \\
&\Rightarrow \phi(m - n) < 0 \\
&\Rightarrow \phi(m) - \phi(n) < 0 \\
&\Rightarrow \phi(m) < \phi(n).
\end{aligned}$$

Thus, $\phi$ is indeed an order-isomorphism, and the proof is complete. ∎

According to this result, there is really only one well-ordered domain, 
namely Z! Any well-ordered domain D is an order-isomorphic “copy” of 
$\mathbf{Z}$; they can differ only in the names of their elements, but their internal
structures are the same. Any statement about $\mathbf{Z}$ can be transferred to $D$, and
conversely. The way to formulate this in mathematical language is to say that
we have a characterization of $\mathbf{Z}$, or, more precisely, that the axioms for a
well-ordered domain characterize $\mathbf{Z}$. If these axioms hold, then we are dealing
with some “version” of $\mathbf{Z}$.

Note that if $D_1$ and $D_2$ are any two well-ordered domains, then they are
order-isomorphic. In fact, by 2-5-11, there exist order-preserving isomorphisms
$\phi_1: \mathbf{Z} \to D_1$ and $\phi_2: \mathbf{Z} \to D_2$. It then follows easily that $\phi_1^{-1}: D_1 \to \mathbf{Z}$ is an
order-preserving isomorphism, and so is

$$\phi_2 \circ \phi_1^{-1}: D_1 \to D_2.$$

Having developed a set of axioms which determine $\mathbf{Z}$ completely we could,
in theory at least, go back and prove all the number-theoretic results of
Chapter I in a formal rigorous way without knowing anything about $\mathbf{Z}$ other than
the axioms for a well-ordered domain.


2-5-12 / PROBLEMS 

1. Consider the sets $X = \{1, 2, 3\}$ and $Y = \{1, 2, 3, 4\}$.

(i) Define a mapping $\phi: X \to Y$ by putting

$$\phi(1) = 3, \quad \phi(2) = 1, \quad \phi(3) = 4.$$

Is $\phi$ injective? Is it surjective?

(ii) Define a mapping $\psi: Y \to X$ by putting

$$\psi(1) = 2, \quad \psi(2) = 3, \quad \psi(3) = 1, \quad \psi(4) = 3.$$

Is $\psi$ injective? Is it surjective?

(iii) Compute the composite mappings $\psi \circ \phi \in \mathrm{Map}(X, X)$ and
$\phi \circ \psi \in \mathrm{Map}(Y, Y)$. For each one, decide if it is injective and if it is surjective.

2. In each of the following cases is the given map $\phi: \mathbf{Z} \to \mathbf{Z}$ injective?
surjective? a homomorphism? an isomorphism?

(i) $\phi(x) = x + 5$, (ii) $\phi(x) = 3x - 2$,

(iii) $\phi(x) = |x|$, (iv) $\phi(x) = -x$,

(v) $\phi(x) = x^3 - x$.

Answer the same questions (if possible) when $\phi$ is the mapping of $\mathbf{Q} \to \mathbf{Q}$
defined by the given formulas, and also when $\phi$ is viewed as a mapping of
$\mathbf{R} \to \mathbf{R}$.

3. Let $m = 3$. In each of the following cases, is the given map $\phi: \mathbf{Z}_m \to \mathbf{Z}_m$
injective? surjective? a homomorphism? an isomorphism?

(i) $\phi(\lfloor x \rfloor_m) = 2\lfloor x \rfloor_m$, (ii) $\phi(\lfloor x \rfloor_m) = \lfloor x \rfloor_m + \lfloor 1 \rfloor_m$,

(iii) $\phi(\lfloor x \rfloor_m) = -\lfloor x \rfloor_m$, (iv) $\phi(\lfloor x \rfloor_m) = (\lfloor x \rfloor_m)^2$,

(v) $\phi(\lfloor x \rfloor_m) = (\lfloor x \rfloor_m)^3$.

Answer the same questions for $m = 5$ and for $m = 6$. Investigate these
questions when $m$ is an arbitrary integer greater than 1.

4. When there is a 1-1 correspondence between the sets $X$ and $Y$, let us
denote this by $X \leftrightarrow Y$. Show that the relation of 1-1 correspondence is
reflexive, symmetric, and transitive. In more detail, for sets $X$, $Y$, $Z$

(i) $X \leftrightarrow X$,

(ii) if $X \leftrightarrow Y$, then $Y \leftrightarrow X$,

(iii) if $X \leftrightarrow Y$ and $Y \leftrightarrow Z$, then $X \leftrightarrow Z$.

5. Consider any sets $X$, $Y$, $Z$, $W$ and any mappings $f: X \to Y$, $g: Y \to Z$,
$h: Z \to W$; then we have mappings $h \circ (g \circ f)$ and $(h \circ g) \circ f$ in $\mathrm{Map}(X, W)$.
Prove that

$$h \circ (g \circ f) = (h \circ g) \circ f.$$

Of course, this may be viewed as a sort of associative law for mappings.

6. Consider the three element set $X = \{1, 2, 3\}$.

(i) Find two mappings $f: X \to X$ and $g: X \to X$ such that $g \circ f \neq f \circ g$.

(ii) Can you do this in such a way that both $f$ and $g$ are 1-1 onto?

(iii) How many elements are there in $\mathrm{Map}(X, X)$? How many of these
are surjective maps?

7. (i) Exhibit two mappings $f: \mathbf{Z} \to \mathbf{Z}$ and $g: \mathbf{Z} \to \mathbf{Z}$ for which $g \circ f = 1$
(the identity map of $\mathbf{Z}$) and $f \circ g \neq 1$.

(ii) Can you do this in such a way that $f$ is not surjective and $g$ is not
injective?

8. Suppose $R$ and $S$ are rings and consider a mapping $\phi: R \to S$. Now, $\phi$
may be injective or it may not; $\phi$ may be surjective or it may not; $\phi$ may
be a homomorphism or it may not. Thus, there are eight “types” for $\phi$.
Exhibit a $\phi$ of each type; of course, you are free to choose $R$ and $S$ in
each case.

9. Let us fix elements $a$, $b$ in the ring $R$, and let us define mappings $\phi: R \to R$
and $\psi: R \to R$ by

$$\phi(x) = a + x, \quad \psi(x) = bx.$$

Find the mappings $\psi \circ \phi$, $\phi \circ \psi$, $\phi \circ \phi$, $\psi \circ \psi$.

We now have a total of six mappings before us. Which ones are
injective? surjective? homomorphisms? Can any of these properties be
affected by imposing additional conditions on $R$ (for example, such
restrictions as: $R$ is commutative; $R$ has a unity; $R$ is a domain, and
so on) and selecting $a$ and $b$ judiciously?

10. Given sets $X$, $Y$ and mappings $f: X \to Y$, $g: Y \to X$, prove that if $g \circ f = 1$,
the identity map of $X$, then $f$ is injective and $g$ is surjective.

11. Suppose a mapping $f: X \to Y$ is given.

(i) Show that $f$ is 1-1 and onto $\Leftrightarrow$ there exists a mapping $g: Y \to X$ for
which $g \circ f = 1$ and $f \circ g = 1$. As indicated in 2-5-4, when such a
mapping $g$ exists we say that “$f$ has an inverse” or that “$g$ is an
inverse of $f$.”

(ii) When the above situation holds, we have:

(a) the inverse of $f$ is unique.

(b) the mapping $g$ has an inverse (as it too is 1-1 and onto), and its
unique inverse is $f$.

12. Suppose $\phi: R_1 \to R_2$ is a surjective homomorphism (where $R_1$ and $R_2$ are
rings). Show that if $R_1$ is commutative, then so is $R_2$.

13. The relation of isomorphism between rings is reflexive, symmetric and
transitive. In more detail, when the rings $R$ and $R'$ are isomorphic, let
us write $R \approx R'$; then for rings $R_1$, $R_2$, $R_3$, prove

(i) $R_1 \approx R_1$,

(ii) if $R_1 \approx R_2$, then $R_2 \approx R_1$,

(iii) if $R_1 \approx R_2$ and $R_2 \approx R_3$, then $R_1 \approx R_3$.

14. If the rings $R$ and $S$ are isomorphic, and $S$ is an integral domain, show
that $R$ is an integral domain.

15. Verify that the mapping $\phi: \mathbf{Z}_{24} \to \mathbf{Z}_6$ defined by

$$\phi(\lfloor a \rfloor_{24}) = \lfloor a \rfloor_6$$

is a homomorphism. Is it surjective? What is the kernel?

16. Suppose $m$ and $n$ are integers greater than 1 with $n \mid m$. Find a “natural”
homomorphism $\phi: \mathbf{Z}_m \to \mathbf{Z}_n$. Is it surjective? What is the kernel? What
happens if $n \nmid m$?

17. Suppose $\{R, +, \cdot\}$ is a ring with unity $e$. Let us define, for $a, b \in R$

$$a \perp b = a + b + e, \quad a \circ b = a + b + ab.$$

Prove that $\{R, \perp, \circ\}$ is a ring. Does it have a unity? Show that the
mapping $\phi: a \to a - e$ is an isomorphism of the ring $\{R, +, \cdot\}$ onto the
ring $\{R, \perp, \circ\}$. What is the inverse isomorphism?

18. For any rings $R_1$ and $R_2$ the direct sum rings $R_1 \oplus R_2$ and $R_2 \oplus R_1$ are
isomorphic. More generally, if $R_1, R_2, \ldots, R_n$ are rings and $\sigma$ is a
permutation of the set $\{1, 2, \ldots, n\}$, then the rings $R_1 \oplus R_2 \oplus \cdots \oplus R_n$ and
$R_{\sigma 1} \oplus R_{\sigma 2} \oplus \cdots \oplus R_{\sigma n}$ are isomorphic.

19. If $\phi_1: R \to R_1$ and $\phi_2: R \to R_2$ are homomorphisms of rings, then the
mapping $\phi: R \to R_1 \oplus R_2$ defined by

$$\phi(a) = (\phi_1(a), \phi_2(a)), \quad a \in R$$

is also a homomorphism.

20. Suppose $\phi_1: R_1 \to S_1$ and $\phi_2: R_2 \to S_2$ are homomorphisms of rings.
Define $\phi_1 \oplus \phi_2: R_1 \oplus R_2 \to S_1 \oplus S_2$ by

$$(\phi_1 \oplus \phi_2)(a_1, a_2) = (\phi_1(a_1), \phi_2(a_2)), \quad a_1 \in R_1,\ a_2 \in R_2$$

and show it is a homomorphism. Furthermore, if both $\phi_1$ and $\phi_2$ are
surjective, then so is $\phi_1 \oplus \phi_2$; and if both $\phi_1$ and $\phi_2$ are injective, then
so is $\phi_1 \oplus \phi_2$; consequently, if both $\phi_1$ and $\phi_2$ are isomorphisms, then
so is $\phi_1 \oplus \phi_2$.

21. Suppose $D_1$, $D_2$, $D_3$ are ordered domains and $\phi_1: D_1 \to D_2$, $\phi_2: D_2 \to D_3$
are order-isomorphisms. Show that $\phi_1^{-1}$ and $\phi_2 \circ \phi_1$ are
order-isomorphisms.

22. Are the rings $\mathbf{Z}[i] = \{a + bi \mid a, b \in \mathbf{Z}\}$ and $\mathbf{Z}[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in \mathbf{Z}\}$
isomorphic?

23. (i) Consider the rings $R_1$, $R_2$ and their direct sum $R_1 \oplus R_2$. As in 2-5-7,
part (7), define mappings

$$f_1: R_1 \to R_1 \oplus R_2, \quad f_2: R_2 \to R_1 \oplus R_2,$$
$$g_1: R_1 \oplus R_2 \to R_1, \quad g_2: R_1 \oplus R_2 \to R_2,$$

by putting

$$f_1(a_1) = (a_1, 0), \quad f_2(a_2) = (0, a_2),$$
$$g_1(a_1, a_2) = a_1, \quad g_2(a_1, a_2) = a_2,$$

for $a_1 \in R_1$, $a_2 \in R_2$.

Then, $f_1$ and $f_2$ are injective homomorphisms, and $g_1$ and $g_2$ are
surjective homomorphisms.

By taking composites we obtain the following homomorphisms

$$g_1 \circ f_1: R_1 \to R_1, \quad g_2 \circ f_1: R_1 \to R_2,$$
$$g_1 \circ f_2: R_2 \to R_1, \quad g_2 \circ f_2: R_2 \to R_2,$$
$$f_1 \circ g_1: R_1 \oplus R_2 \to R_1 \oplus R_2, \quad f_2 \circ g_2: R_1 \oplus R_2 \to R_1 \oplus R_2,$$

and, in fact,

$$g_1 \circ f_1 = 1, \quad g_2 \circ f_2 = 1,$$
$$g_1 \circ f_2 = 0, \quad g_2 \circ f_1 = 0,$$
$$f_1 \circ g_1 + f_2 \circ g_2 = 1.$$

[Here, as discussed in 2-2-12, $0$ refers to the zero map, and $+$ denotes
the sum of two functions in $\mathrm{Map}(R_1 \oplus R_2, R_1 \oplus R_2)$. Of course, the
composition $\circ$ of two functions has no connection with the product $\cdot$
of two functions as discussed in 2-2-12.]

(ii) More generally, consider the direct sum of rings, $R_1 \oplus R_2 \oplus \cdots \oplus R_n$.
For each $i = 1, 2, \ldots, n$ define mappings

$$f_i: R_i \to R_1 \oplus \cdots \oplus R_n, \quad g_i: R_1 \oplus \cdots \oplus R_n \to R_i$$

by putting, for $a_1 \in R_1$, $a_2 \in R_2$, ..., $a_n \in R_n$

$$f_i(a_i) = (0, \ldots, 0, a_i, 0, \ldots, 0), \quad g_i(a_1, a_2, \ldots, a_n) = a_i.$$

Then each $f_i$ is an injective homomorphism, known as the injection on
the $i$th coordinate; and each $g_i$ is a surjective homomorphism, known
as the projection on the $i$th coordinate.

For each pair of integers $i, j$ with $1 \le i, j \le n$ we have a
homomorphism

$$g_i \circ f_j: R_j \to R_i;$$

in fact,

$$g_i \circ f_j = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases}$$

Furthermore, for each $i$ we have a homomorphism

$$f_i \circ g_i: R_1 \oplus \cdots \oplus R_n \to R_1 \oplus \cdots \oplus R_n$$

and these satisfy the relation

$$f_1 \circ g_1 + f_2 \circ g_2 + \cdots + f_n \circ g_n = 1$$

(the identity map of $R_1 \oplus \cdots \oplus R_n$).

Miscellaneous Problems

1. In Z, let us write 

ajb = max {a, b} 9 alb — min{a, b }. 

Is { Z, T 5 -L} a ring? Which axioms are satisfied and which are not? 

2. (i) Show that a nonempty subset S of Z that is closed under subtraction 

is a subring of Z; so there exists (see Problem 11 of 2-2-15) an 
integer d > 0 for which S = dZ = {nd\ n e Z}. 

(ii) Show by example that a nonempty subset of Z that is closed under 
addition need not be a subring. 

3. Consider the set # of all continuous real-valued functions on the closed 
interval [0, 1]. Show it is a commutative ring with unity. Is it an integral 
domain? If we write X = [0, 1] how does # compare with the ring 
Map(Jf, R)? Show that the set of all differentiable functions on [0, 1] is a 
subring of 

4. Suppose/and g are two real-valued functions (on [0, 1], say) that have as 
many derivatives as desired. For any n > 1, find a formula for the nth 
derivative of the product fg\ prove the formula by induction. 

5. Which of the following subsets of Map(R, R) is a commutative ring with 
unity ? 

(0 {/ /(l) = 0}; 

(n) {//(1)#0}; 

(Hi) {//(l)=/(0)}; 

(iv) {/ |/(1)| <M }, where M is a fixed real number; 

(v) {/ |/(*)| <M for all x e R}, M fixed. 

6. Show that in the definition of a ring, Axioms A3 and A4 may be replaced
by the statement: For any $a, b \in R$, there exists an element $x \in R$ such
that a + x = b.

7. Suppose $\{R, +, \cdot\}$ satisfies all the requirements for a ring with unity e
except the commutative law for addition. By expanding $(a + b)(e + e)$,
show that addition is commutative, so $\{R, +, \cdot\}$ is indeed a ring.

8. In an ordered domain, the cancellation law for multiplication can be 
proved from the other assumptions. In other words, show that if R is a 
commutative ring with unity in which the axioms for an ordering (as in 
2-3-1) hold, then R is a domain. 

9. Show that Z p —or, more generally, any finite integral domain—cannot be 
made into an ordered domain. 
10. Suppose R satisfies all the axioms for a ring except the commutative law
for addition. Prove that if there exists $c \in R$ for which $ca = cb \Rightarrow a = b$,
then R is a ring.

11. Suppose the ring R has a unique element e such that ea = a for all a e R 
(in other words, e is a unique left identity); show that e is an identity for 
multiplication. 

12. If A, B, C are subsets of the universal set X, prove the following:

(i) $A = \emptyset \iff B = (A \cap B^c) \cup (A^c \cap B)$;

(ii) $A \cap B = A - (A - B) = B - (B - A) = (A \cup B) - (A + B)$;

(iii) $A - C \subset (A - B) \cup (B - C)$;

(iv) $(A \cup B) - (B \cap C) = (A - B) \cup (B - C)$;

(v) $(A - B) \cap C = A \cap C - B \cap C$;

(vi) $A \cap (B + C) = A \cap B + A \cap C$.

13. Suppose A, B, C are elements of $\mathcal{P}(X)$. Show that if $A \cup B = A \cup C$ and
$A \cap B = A \cap C$ then B = C.

14. (i) If A and B are finite sets prove that

$\#(A \cup B) = \#(A) + \#(B) - \#(A \cap B).$

(ii) If $A_1, A_2, A_3$ are finite sets prove that

$\#(A_1 \cup A_2 \cup A_3) = \#(A_1) + \#(A_2) + \#(A_3) - \#(A_1 \cap A_2) - \#(A_1 \cap A_3) - \#(A_2 \cap A_3) + \#(A_1 \cap A_2 \cap A_3).$

(iii) Generalize the preceding to the case of n finite sets $A_1, A_2, \ldots, A_n$.

15. (i) If A and B are finite sets then prove that

$\#(A + B) = \#(A) + \#(B) - 2\#(A \cap B).$

(ii) If $A_1, A_2, A_3$ are finite sets, find an expression for $\#(A_1 + A_2 + A_3)$.

(iii) Can you generalize the preceding to the case of n finite sets
$A_1, A_2, \ldots, A_n$?

16. Suppose $\{A_\alpha \mid \alpha \in \mathfrak{A}\}$ is an arbitrary collection, finite or infinite, of subsets
$A_\alpha$ of the set X. (Here, $\mathfrak{A}$ is simply the set of all indices.) Prove the
generalized de Morgan laws:

$\Bigl(\bigcup_{\alpha} A_\alpha\Bigr)^c = \bigcap_{\alpha} A_\alpha^c, \qquad \Bigl(\bigcap_{\alpha} A_\alpha\Bigr)^c = \bigcup_{\alpha} A_\alpha^c.$

17. A ring R is said to be Boolean when $a^2 = a$ for every $a \in R$.

(i) Give an example of a Boolean ring.

(ii) Prove: If R is Boolean then 2a = 0 for every $a \in R$.

(iii) Show that a Boolean ring is commutative.

(iv) Prove: If R is a Boolean ring with identity e then (excluding 0 and e)
every element of R is a zero divisor.

(v) Prove: If R and S are Boolean rings, so is their direct sum $R \oplus S$.
What about the converse?

18. In the commutative ring R, suppose a and b are nilpotent elements (see
2-4-10, Problem 13), and c is arbitrary. Show that a + b, a - b, and ca
are nilpotent.

19. Give examples of a ring R and a subring S such that 

(i) R and S have the same unity. 

(ii) R and S have different unities. 

(iii) R has unity but S does not. 

(iv) S has unity but R does not. 

(v) S is commutative but R is not. 

(vi) R has unity and is not a domain, and S is a domain. Do R and S 
have the same unity ? 

20. Give examples of rings R and S and a surjective homomorphism $\phi : R \to S$
such that

(i) S is commutative but R is not.

(ii) S has a unity but R does not. 

(iii) S is a domain but R is not. 

(iv) R is a domain but S is not. 

21. Suppose R is a ring and S is a set on which we have two operations +
and $\cdot$. Show that if $\phi : R \to S$ is a surjective map that preserves the operations
(that is, it satisfies the requirements for a homomorphism) then S is
a ring. One says: a homomorphic image of a ring is a ring.

22. Suppose $\phi$ is a 1-1 mapping of the set S onto the ring $R = \{R, +, \cdot\}$.
Show that S becomes a ring, isomorphic to R, when its operations are
defined by:

$a \oplus b = \phi^{-1}(\phi(a) + \phi(b))$

$a \odot b = \phi^{-1}(\phi(a) \cdot \phi(b))$

What if $\phi$ is given as a map of S onto R?

23. Suppose R is a ring (not necessarily commutative) with unity e. We say
that the element $a \in R$ has a right inverse when there exists $b \in R$ for
which ab = e, and that $a \in R$ has a left inverse when there exists $c \in R$ for
which ca = e.

(i) If b is a right inverse of a, and c is a left inverse of a, show that
b = c. We then have ab = ba = e, and b is said to be an inverse
of a.

(ii) If a has an inverse, it is unique; denote it by $a^{-1}$. Show that $a^{-1}$
also has an inverse, namely a.

(iii) Prove: If both a and b have inverses, then so does ab; in fact,
$(ab)^{-1} = b^{-1} a^{-1}$.

(iv) Give an example of a ring R and two elements $a, b \in R$ such that
a and b have inverses but $(ab)^{-1} \ne a^{-1} b^{-1}$.

24. Consider $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $\mathcal{M}(\mathbf{Z}, 2)$; then prove the following:

(i) There exists $B \in \mathcal{M}(\mathbf{Z}, 2)$ for which AB = I (that is, A has a right
inverse) if and only if $ad - bc = \pm 1$.

(ii) There exists $B \in \mathcal{M}(\mathbf{Z}, 2)$ for which AB = I if and only if there exists
$C \in \mathcal{M}(\mathbf{Z}, 2)$ for which CA = I. That is, A has a right inverse if and
only if A has a left inverse.

(iii) If AB = I and CA = I then B = C.

(iv) If AB = I then BA = I (that is, if A has a right inverse then it has an
inverse) and conversely.

(v) If there exists B such that AB = I, then B is unique.

25. Consider $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in $\mathcal{M}(\mathbf{Q}, 2)$ [or in $\mathcal{M}(\mathbf{R}, 2)$]. Then there exists
$B \in \mathcal{M}(\mathbf{Q}, 2)$ [or in $\mathcal{M}(\mathbf{R}, 2)$] for which $AB = I \iff ad - bc \ne 0$. Discuss
the remaining assertions of the preceding problem in this situation.

26. Let R be a ring with unity e. Suppose ae R has a right inverse. Then 
show that the following are equivalent: 

(i) The right inverse-of a is unique; 

(ii) a has an inverse; 

(iii) a is not a left zero-divisor (that is, ab = 0 <=> b = 0). 

27. Consider a ring R. If m is the smallest positive integer such that ma = 0 
for all a e R, we say that R has characteristic m (in particular, R is said to 
have finite characteristic) and write char R = m. If no such m exists (so 
there is no n > 0 such that na = 0 for all ae R) we say that R has 
characteristic 0. 

Let D be an integral domain; then

(i) Prove: If there exist n > 0 in Z and $b \ne 0$ in D such that nb = 0 then
na = 0 for all $a \in D$; so D has finite characteristic.

(ii) Prove: If D has finite characteristic then char D is the smallest
positive integer d such that db = 0 for some $b \ne 0$ in D.

(iii) Prove: char D is 0 or a prime.

(iv) For an arbitrary m > 0, can you produce a ring R of characteristic
m?

28. Consider a ring R without unity, and the product set $R \times \mathbf{Z}$, which we
denote by $\tilde{R}$. If we define operations in $\tilde{R}$ by:

$(a, m) + (b, n) = (a + b, m + n),$

$(a, m) \cdot (b, n) = (ab + na + mb, mn),$

then $\tilde{R}$ becomes a ring with unity (0, 1). What is the characteristic of $\tilde{R}$?
The mapping $a \to (a, 0)$ is an injective homomorphism of R into $\tilde{R}$; thus
$\tilde{R}$ contains an isomorphic copy of R as a subring, and R has been
“imbedded” in a ring with unity.

Explain the reasons underlying the choice of $\tilde{R} = R \times \mathbf{Z}$ and the
definition of its operations. Can you imbed the given ring R in a ring
with unity whose characteristic is the same as the characteristic of R?

29. If R is a ring with unity of characteristic 0 show that R contains a subring 
that is isomorphic to Z. 

30. If a is an odd integer, prove that $a^2 \equiv 1 \pmod{8}$, $a^4 \equiv 1 \pmod{16}$ and, by
induction,

$a^{2^n} \equiv 1 \pmod{2^{n+2}}.$

31. (i) Show that if p is prime then for any positive r and n,

$a \equiv b \pmod{p^n} \Rightarrow a^{p^r} \equiv b^{p^r} \pmod{p^{n+r}}.$

(ii) Conversely, show that if the prime p is odd and $p \nmid a$ then
$a^{p^r} \equiv b^{p^r} \pmod{p^{n+r}} \Rightarrow a \equiv b \pmod{p^n}$.

32. (i) Why is the relation $|a + b| \le |a| + |b|$ in an ordered domain known
as the triangle inequality?

(ii) If $a_1, a_2, \ldots, a_n$ are elements of an ordered domain prove, by induction,
that

$\Bigl|\sum_{i=1}^{n} a_i\Bigr| \le \sum_{i=1}^{n} |a_i|.$

(iii) Decide whether the following formula holds for elements of an ordered
domain

$(a_1 b_1 + a_2 b_2 + \cdots + a_n b_n)^2 \le (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2).$

33. As usual, [a, b] represents the closed interval $\{x \in \mathbf{R} \mid a \le x \le b\}$;
prove that

$x \to \Bigl(\frac{x - a}{b - a}\Bigr) d + \Bigl(\frac{x - b}{a - b}\Bigr) c$

is a 1-1 mapping of [a, b] onto [c, d].

34. Consider sets X, Y and a mapping $f : X \to Y$; then:

(i) Show that f is injective if and only if there exists $g : Y \to X$ such that
$g \circ f = 1$.

(ii) Show that f is surjective if and only if there exists $h : Y \to X$ such that
$f \circ h = 1$.

(iii) Show that if $g : Y \to X$ and $h : Y \to X$ are such that $g \circ f = 1$ and
$f \circ h = 1$ then g = h.

(iv) Show that f is injective and surjective if and only if there exists
$g : Y \to X$ such that $g \circ f = 1$ and $f \circ g = 1$.

35. Define a function f recursively, on all positive integers as follows:

$f(1) = 1, \qquad f(n + 1) = 2f(n) + 1.$

Find the value of f(n). How is this related to Problem 24 of 2-3-20?

36. Use the fact that a set with n elements has exactly (£) subsets with k 
elements to prove the relation 

for binomial coefficients (which was mentioned in the proof of 2-4-8). 
How is this relation connected with Pascal’s triangle? 


37. Given a set M with m elements and a nonempty subset $M_1$.

(i) Show that the number of subsets of M with an odd number of elements
is $2^{m-1}$, and so is the number of subsets with an even number
of elements.

(ii) Show that the number of subsets of M containing an odd number of
elements from $M_1$ is $2^{m-1}$.


38. Prove the following properties of binomial coefficients: 




(m) YK-lf (”)=0, (/,) i(-l) k h l n \ = 0 . 


<•> ,s(;)(,/-Hr)- 


n> r, 
39. Let $\sigma(n, k) = 1^k + 2^k + \cdots + n^k$. By induction on n (keeping k fixed) prove
that

$\binom{k+1}{1}\sigma(n, 1) + \binom{k+1}{2}\sigma(n, 2) + \cdots + \binom{k+1}{k}\sigma(n, k) = (n + 1)^{k+1} - (n + 1).$

Use this formula to compute $\sigma(n, i)$ for i = 1, 2, 3, 4.

40. Suppose the three conditions of (2-3-4) are satisfied in the domain D — 
namely: 

(i) a > 0, b > 0 => a + b > 0, 

(ii) a > 0, b > 0 => ab > 0. 

(iii) For any ae D exactly one of a > 0, a — 0, a < 0 holds. 

Let P = {a e D | a > 0}. Is {D, P} then an ordered domain ? 

41. Can 2-3-16 be proved by considering the set 

5={fle P| n(b) is true for all b < a} ? 

42. Suppose {D, P} is an ordered domain. We have seen in 2-3-14 that if
{D, P} is well ordered then mathematical induction holds. Prove the
converse. To put it another way, show that the following conditions are
equivalent in {D, P}:

( i ) Any inductive set of positive elements that contains the identity e 
must be P itself. 

(ii) Every nonempty subset of P has a smallest element. 

43. Show that Condition (i) of Problem 42 may be replaced by our other 
formulations of mathematical induction—namely, those of 2-3-15 and 
2-3-16. 


44. In $\mathbf{Z}_6$ show that $R_1 = \{0, 2, 4\}$ and $R_2 = \{0, 3\}$ are subrings with
$R_1 \cap R_2 = (0)$ and $R_1 + R_2 = \mathbf{Z}_6$. [Recall that $R_1 + R_2 = \{a_1 + a_2 \mid a_i \in R_i\}$.]
One then says that $\mathbf{Z}_6$ is the (internal) direct sum of the rings $R_1$ and
$R_2$. Prove that every element $a \in \mathbf{Z}_6$ can be written uniquely in the form
$a = a_1 + a_2$, $a_1 \in R_1$, $a_2 \in R_2$.

Express $\mathbf{Z}_{10}$ and $\mathbf{Z}_{12}$ as internal direct sums of subrings. What about
$\mathbf{Z}_m$ in general?

45. Can you find an injective homomorphism of $\mathbf{Z}_m$ into $\mathbf{Z}_n$ when $m \mid n$; in
other words, does $\mathbf{Z}_n$ contain an isomorphic copy of $\mathbf{Z}_m$? What if
(m, n/m) = 1?
46. Consider the rings

(i) $\mathbf{Z}_3$, (ii) $\mathbf{Z}_4$, (iii) $\mathbf{Z}_5$, (iv) $\mathbf{Z}_p$, (v) $\mathbf{Z}_m$.

In each case, find all subrings of the given ring. What are their characteristics?
In each case, find as many automorphisms of the given ring as you
can.

47. Consider the rings

(i) Z, (ii) Q, (iii) R.

In each case show that the only automorphism of the given ring is the
identity.

48. (i) If m is a square free integer show that $\mathbf{Z}[\sqrt{m}] = \{a + b\sqrt{m} \mid a, b \in \mathbf{Z}\}$
is an integral domain. How many automorphisms can you find?

(ii) If, in addition, $m \equiv 1 \pmod{4}$ and we put $\omega = (1 + \sqrt{m})/2$ then
$\mathbf{Z}[\omega] = \{a + b\omega \mid a, b \in \mathbf{Z}\}$ is an integral domain. How many automorphisms
can you find?

49. Prove that if (j) is an endomorphism of the ring R (meaning: a homo¬ 
morphism of R into itself), then S = {a e R \ <f)(a ) = a) is a subring of R . 

50. Show that all 2 x 2 matrices of form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ with $a, b \in \mathbf{Z}$ constitute an
integral domain which is isomorphic to the domain of Gaussian integers,
$\mathbf{Z}[i]$.

51. Suppose R is a ring with unity e and n is an integer greater than one.
Consider $\mathcal{M}(R, n)$, the set of all n x n matrices with entries from R. We
write a typical matrix of $\mathcal{M}(R, n)$ in the abbreviated form $(a_{ij})$, which
signifies that the entry of the matrix in the i, j place is the element $a_{ij}$ of R.
Define addition in $\mathcal{M}(R, n)$ by

$(a_{ij}) + (b_{ij}) = (c_{ij})$, where $c_{ij} = a_{ij} + b_{ij}$, $i, j = 1, \ldots, n$.

Define multiplication in $\mathcal{M}(R, n)$ by

$(a_{ij}) \cdot (b_{ij}) = (c_{ij})$, where $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$, $i, j = 1, \ldots, n$.

Verify that $\mathcal{M}(R, n)$ is a ring with unity. The unity element I has e along
the main diagonal and 0 elsewhere.

Define also multiplication of a matrix by an element of R (called
“multiplication by a scalar”) by

$c(a_{ij}) = (c a_{ij}), \qquad c \in R.$

In other words, every entry of the matrix is multiplied by c.

Show that the following are subrings of $\mathcal{M}(R, n)$.

(i) $\{cI \mid c \in R\}$; these are known as the scalar matrices. Each one is of
form $(a_{ij})$, where $a_{11} = a_{22} = \cdots = a_{nn} = c$ and $a_{ij} = 0$ for $i \ne j$.

(ii) The set of diagonal matrices; these are of form $(a_{ij})$, where $a_{ij} = 0$
for $i \ne j$. In other words, all entries off the main diagonal are 0.

(iii) The (upper) triangular matrices; these are of form $(a_{ij})$ where
$a_{ij} = 0$ for i < j. In other words, all entries “above” the main
diagonal are 0.

(iv) The strictly triangular matrices; these are of form $(a_{ij})$ where $a_{ij} = 0$
for $i \le j$. In other words, all entries above and on the main diagonal
are 0.

Fix an integer r, $1 \le r \le n$. Is the set of matrices that are 0 outside the
rth row, that is, $\{(a_{ij}) \mid a_{ij} = 0$ when $i \ne r\}$, a subring? What about

$\{(a_{ij}) \mid a_{ij} = 0$ when $i > r\}$?

52. (i) Show that $\mathcal{M}(R, n)$ always has zero divisors; in more detail, exhibit
$A, B \in \mathcal{M}(R, n)$ such that AB = BA = 0.

(ii) Show that if there exist $a, b \in R$ with $ab \ne 0$, then $\mathcal{M}(R, n)$ is not
commutative.

53. Show that if $\phi : R \to R'$ is a homomorphism of rings, then the mapping
$(a_{ij}) \to (\phi(a_{ij}))$ is a homomorphism of $\mathcal{M}(R, n) \to \mathcal{M}(R', n)$.

54. For each pair i, j with $1 \le i, j \le n$ let $E_{ij}$ denote the matrix with e in the
i, j place and 0 elsewhere. Then show that any matrix of $\mathcal{M}(R, n)$ has a
unique expression of form

$\sum_{i,j=1}^{n} a_{ij} E_{ij}, \qquad a_{ij} \in R.$

Show also that the $E_{ij}$'s multiply according to the rule

$E_{rs} E_{uv} = 0$ if $s \ne u$, and $E_{rs} E_{uv} = E_{rv}$ if $s = u$.

55. Consider the real quaternions Q. These are all the expressions $\alpha =
a_0 + a_1 i + a_2 j + a_3 k$ where $a_0, a_1, a_2, a_3 \in \mathbf{R}$ and i, j, k are formal
symbols. If also $\beta = b_0 + b_1 i + b_2 j + b_3 k \in$ Q then it is understood that

$\alpha = \beta \iff a_0 = b_0, \; a_1 = b_1, \; a_2 = b_2, \; a_3 = b_3.$

We define addition and multiplication of quaternions by:

$\alpha + \beta = (a_0 + b_0) + (a_1 + b_1)i + (a_2 + b_2)j + (a_3 + b_3)k,$

and

$\alpha\beta = (a_0 b_0 - a_1 b_1 - a_2 b_2 - a_3 b_3) + (a_0 b_1 + a_1 b_0 + a_2 b_3 - a_3 b_2)i$
$\qquad + (a_0 b_2 - a_1 b_3 + a_2 b_0 + a_3 b_1)j + (a_0 b_3 + a_1 b_2 - a_2 b_1 + a_3 b_0)k.$

Multiplication is not really mysterious. One wants to have

$i^2 = j^2 = k^2 = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j,$

and, assuming these rules, when $\alpha$ and $\beta$ are multiplied according to the
“usual rules” the result is the given expression for $\alpha\beta$.

Then Q is a noncommutative ring with zero-element 0 = 0 + 0i + 0j
+ 0k and unity 1 = 1 + 0i + 0j + 0k. Moreover, every $\alpha \ne 0$ in Q has an
inverse; in fact, writing $a_0^2 + a_1^2 + a_2^2 + a_3^2 = d$ (so $d \ne 0$) and $\beta =
(a_0/d) - (a_1/d)i - (a_2/d)j - (a_3/d)k$, we have $\alpha\beta = \beta\alpha = 1$.

Prove that Q is isomorphic to the ring of quaternions consisting of all 


2x2 matrices 



with a, p e C as defined in 2-2-15, Problem 29. 
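
The product and inverse formulas of Problem 55 can be checked numerically. The following is only a sketch: it represents a quaternion $a_0 + a_1 i + a_2 j + a_3 k$ as a 4-tuple, and the names `qmul` and `qinv` are illustrative choices, not notation from the text. Exact rational arithmetic is used so that equality tests are reliable.

```python
from fractions import Fraction

def qmul(a, b):
    """Multiply quaternions given as 4-tuples, using the product formula of Problem 55."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qinv(a):
    """Inverse of a nonzero quaternion: conjugate divided by d = a0^2 + a1^2 + a2^2 + a3^2."""
    d = sum(Fraction(x) ** 2 for x in a)
    a0, a1, a2, a3 = a
    return (a0 / d, -a1 / d, -a2 / d, -a3 / d)

ONE = (1, 0, 0, 0)
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

print(qmul(I, J))                      # (0, 0, 0, 1), that is, k
print(qmul(J, I))                      # (0, 0, 0, -1), that is, -k
alpha = (1, 2, 3, 4)
print(qmul(alpha, qinv(alpha)) == ONE) # True
```

The first two lines of output show that ij and ji differ, so multiplication is not commutative.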


56. By an arithmetic function or a number-theoretic function we mean an
element of Map($\mathbf{Z}_{>0}$, Z), where $\mathbf{Z}_{>0}$ denotes the set of all positive
integers. An arithmetic function f is said to be multiplicative if $f(mn) =
f(m)f(n)$ whenever m and n are relatively prime. Then:

(i) If $f \in$ Map($\mathbf{Z}_{>0}$, Z) is multiplicative and $g \in$ Map($\mathbf{Z}_{>0}$, Z) is
defined by

$g(n) = \sum_{d \mid n} f(d),$

then g is multiplicative.

(ii) Define $\mu \in$ Map($\mathbf{Z}_{>0}$, Z) by

$\mu(n) = 1$ if n = 1,
$\mu(n) = 0$ if some square divides n,
$\mu(n) = (-1)^r$ if n is the product of r distinct primes.

Then $\mu$, which is known as the Möbius function, is multiplicative and
it satisfies

$\sum_{d \mid n} \mu(d) = 1$ if n = 1, and $\sum_{d \mid n} \mu(d) = 0$ if n > 1.

(iii) If $f \in$ Map($\mathbf{Z}_{>0}$, Z) and g is defined by

$g(n) = \sum_{d \mid n} f(d), \qquad (*)$

then

$f(n) = \sum_{d \mid n} \mu(d)\, g\Bigl(\frac{n}{d}\Bigr) = \sum_{d \mid n} \mu\Bigl(\frac{n}{d}\Bigr) g(d). \qquad (**)$

This is known as the Möbius inversion formula. Conversely, if
$g \in$ Map($\mathbf{Z}_{>0}$, Z) is given and f is defined by (**) then the formula
(*) holds.

(iv) If $g(n) = \sum_{d \mid n} f(d)$ and g is multiplicative, then so is f.

(v) How are the preceding results affected if we work in Map($\mathbf{Z}_{>0}$, R)?
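
Before attempting the proofs in Problem 56, it can help to test the statements on small values. The sketch below is a numerical check, not a proof; the helper names `mobius`, `divisors`, `summatory`, and `mobius_invert` are illustrative, and the test function `f` is an arbitrary choice.

```python
def mobius(n):
    """Mobius function: 1 if n == 1, 0 if a square > 1 divides n, else (-1)^r."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p^2 divided the original n
                return 0
            result = -result    # one more distinct prime factor
        p += 1
    if n > 1:                   # a single prime factor is left over
        result = -result
    return result

def divisors(n):
    """All positive divisors of n (naive; fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]

def summatory(f, n):
    """g(n) = sum of f(d) over the divisors d of n, as in formula (*)."""
    return sum(f(d) for d in divisors(n))

def mobius_invert(g, n):
    """Sum of mu(d) * g(n/d) over d | n, as in formula (**)."""
    return sum(mobius(d) * g(n // d) for d in divisors(n))

f = lambda n: n * n + 1                 # an arbitrary arithmetic function
g = lambda n: summatory(f, n)

print([summatory(mobius, n) for n in range(1, 9)])   # [1, 0, 0, 0, 0, 0, 0, 0]
print(all(mobius_invert(g, n) == f(n) for n in range(1, 101)))  # True
```

Note that `f` here is not multiplicative, which illustrates that the inversion formula in (iii) does not require multiplicativity.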

57. Prove the following:

(i) $\sum_{d^2 \mid n} \mu(d) = |\mu(n)|$.

(ii) If f is a multiplicative arithmetic function then

$\sum_{d \mid n} \mu(d) f(d) = \prod_{p \mid n} (1 - f(p)), \qquad p$ prime.

58. Let D = Map($\mathbf{Z}_{>0}$, Z), the set of all arithmetic functions. Define addition
+ and convolution * in D by

$(f + g)(n) = f(n) + g(n),$

$(f * g)(n) = \sum_{d \mid n} f(d)\, g\Bigl(\frac{n}{d}\Bigr).$

Show that {D, +, *} is an integral domain; the zero element, 0, is given by
0(n) = 0 for all $n \in \mathbf{Z}_{>0}$,
and the identity for multiplication, e, is given by

$e(n) = 1$ if n = 1, and $e(n) = 0$ if n > 1.


CONGRUENCES AND POLYNOMIALS 


The problem of solving equations of form

$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0$

is a familiar one. In high school, one is concerned primarily with the case
where the “coefficients” $a_0, a_1, \ldots, a_n$ are real numbers, and one usually
seeks a real value for the “unknown” x. Of course, the same kind of problem
can be formulated for any commutative ring R; namely, if $a_0, a_1, \ldots, a_n$ all
belong to R, is there a value in R for the unknown x for which the equation is
satisfied? This question is, however, much too difficult, and we shall only
attack small parts of it.

In this chapter, our approach will be on two distinct levels. On the abstract
theoretical level we shall set up precise terminology about such things as
polynomials, solutions or roots of polynomial equations, and factorization
of polynomials, and we shall prove some basic results about them in the
case of an arbitrary commutative ring with unity R. On the concrete practical
level we shall deal with a specific kind of ring R, the ring of residue classes
$\mathbf{Z}_m$, and learn the theory and practice of solving polynomial equations with
coefficients in $\mathbf{Z}_m$.


3-1. Linear Congruences 

Let us fix m > 1 and write $\mathbf{Z}_m = \{\alpha, \beta, \gamma, \delta, \ldots\}$. In order to study
equations of form

$\alpha_n x^n + \alpha_{n-1} x^{n-1} + \cdots + \alpha_1 x + \alpha_0 = 0, \qquad \alpha_0, \alpha_1, \ldots, \alpha_n \in \mathbf{Z}_m,$

it is appropriate to begin with the simplest case, namely, when n = 1. Thus,
we deal first with the equation $\alpha_1 x + \alpha_0 = 0$, which by a trivial change of
notation we prefer to write as the linear equation

$\alpha x = \beta, \qquad \alpha, \beta \in \mathbf{Z}_m, \quad \alpha \ne 0.$

By a solution of this equation, we mean an element $\gamma \in \mathbf{Z}_m$ such that $\alpha\gamma = \beta$.
In common language, one might then say that “$\gamma$ is a value of x that satisfies
the equation.”

Equations of this type are unfamiliar, so let us try to develop a feeling for 
them by looking at a few examples. 

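3-1-1. Examples. (i) Consider m = 7, $\alpha = \lfloor 6 \rfloor_7 \in \mathbf{Z}_7$, $\beta = \lfloor 2 \rfloor_7 \in \mathbf{Z}_7$. We
want to solve the equation

$\lfloor 6 \rfloor_7 \cdot x = \lfloor 2 \rfloor_7.$

A solution must come from $\mathbf{Z}_7$, so there are exactly seven possibilities: $\lfloor 0 \rfloor_7$,
$\lfloor 1 \rfloor_7$, $\lfloor 2 \rfloor_7$, $\lfloor 3 \rfloor_7$, $\lfloor 4 \rfloor_7$, $\lfloor 5 \rfloor_7$, $\lfloor 6 \rfloor_7$. Using the rules for computation in $\mathbf{Z}_7$
(which were discussed in Section 1-7) we find, by trial and error, that $\gamma = \lfloor 5 \rfloor_7$
is a solution because $\lfloor 6 \rfloor_7 \cdot \lfloor 5 \rfloor_7 = \lfloor 2 \rfloor_7$, and in fact it is the only solution of
the given equation.

(ii) Consider m = 10, $\alpha = \lfloor 6 \rfloor_{10} \in \mathbf{Z}_{10}$, $\beta = \lfloor 3 \rfloor_{10} \in \mathbf{Z}_{10}$; which means that
our equation is

$\lfloor 6 \rfloor_{10} \cdot x = \lfloor 3 \rfloor_{10}.$

By trying each of the 10 elements of $\mathbf{Z}_{10}$, one discovers that this equation has
no solutions.

(iii) Consider the equation

$\lfloor 6 \rfloor_{15} \cdot x = \lfloor 9 \rfloor_{15},$

which is the case m = 15, $\alpha = \lfloor 6 \rfloor_{15} \in \mathbf{Z}_{15}$, $\beta = \lfloor 9 \rfloor_{15} \in \mathbf{Z}_{15}$. There are 15
possible solutions, and by testing all the possibilities it is easy to see that there
are exactly three solutions, namely $\lfloor 4 \rfloor_{15}$, $\lfloor 9 \rfloor_{15}$, $\lfloor 14 \rfloor_{15}$.

(iv) Consider the equation

$\lfloor 6 \rfloor_{15} \cdot x = \lfloor 8 \rfloor_{15}.$

It is straightforward to verify that there are no solutions in $\mathbf{Z}_{15}$.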

These examples indicate that peculiar or surprising things can happen. The 
linear equation ax = p in Z m may have no solutions, or one solution, or 
several solutions. Furthermore, the trial-and-error method used to solve these 
equations is clearly unsatisfactory. It is feasible only when m is small. Surely, 
we need more powerful techniques. 
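
The trial-and-error method of 3-1-1 can be written out as a short sketch. The function name `solve_in_Zm` is an illustrative choice, and residue classes are represented by their least nonnegative representatives 0, 1, ..., m - 1.

```python
def solve_in_Zm(a, b, m):
    """Return every c in {0, ..., m-1} with a*c congruent to b mod m, by exhaustive trial."""
    return [c for c in range(m) if (a * c - b) % m == 0]

print(solve_in_Zm(6, 2, 7))    # [5]
print(solve_in_Zm(6, 3, 10))   # []
print(solve_in_Zm(6, 9, 15))   # [4, 9, 14]
print(solve_in_Zm(6, 8, 15))   # []
```

The loop performs m trials, which is exactly why the method is feasible only when m is small.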

Returning to our general linear equation $\alpha x = \beta$ in $\mathbf{Z}_m$, let us write
$\alpha = \lfloor a \rfloor_m$, $\beta = \lfloor b \rfloor_m$ where $a, b \in \mathbf{Z}$. We may also write an arbitrary element
$\gamma \in \mathbf{Z}_m$ in the form $\gamma = \lfloor c \rfloor_m$ where $c \in \mathbf{Z}$. (As a rule, elements of Z will be
denoted by a, b, c, d, ....) Now, by making use of elementary facts about
congruence classes (from Section 1-7) we have

$\gamma = \lfloor c \rfloor_m \in \mathbf{Z}_m$ is a solution of $\alpha x = \beta$
$\iff \alpha\gamma = \beta$
$\iff ac \equiv b \pmod{m}$
$\iff c \in \mathbf{Z}$ is a solution of the congruence $ax \equiv b \pmod{m}$.

Naturally, an equation of form $ax \equiv b \pmod{m}$ is known as a linear congruence,
and by a solution of such an equation we mean (as has already been
indicated) an integer c for which $ac \equiv b \pmod{m}$.

Note that the role played by the symbol x depends on the context; in
$\alpha x = \beta$ the “values” for x are to come from $\mathbf{Z}_m$, while in $ax \equiv b \pmod{m}$ the
“values” for x are to come from Z. Of course, the important thing is to
understand the connection or relation between the solutions of these two
kinds of equations. This connection was proved above, so we merely state it
formally.

3-1-2. Proposition. Suppose $\alpha, \beta \in \mathbf{Z}_m$ and $a, b \in \mathbf{Z}$ are any representatives
for $\alpha$ and $\beta$, respectively; that is, $\alpha = \lfloor a \rfloor_m$, $\beta = \lfloor b \rfloor_m$. Then $\gamma = \lfloor c \rfloor_m \in \mathbf{Z}_m$
is a solution of the linear equation $\alpha x = \beta$ $\iff$ $c \in \mathbf{Z}$ is a solution of the linear
congruence $ax \equiv b \pmod{m}$.
According to this result, in order to solve a linear equation in $\mathbf{Z}_m$ we need
only solve a linear congruence (which involves integers rather than residue
classes). For example, to solve the equations discussed in 3-1-1, we may
“replace” them by the linear congruences $6x \equiv 2 \pmod{7}$, $6x \equiv 3 \pmod{10}$,
$6x \equiv 9 \pmod{15}$, $6x \equiv 8 \pmod{15}$, respectively. Of course, other replacements
are possible. For example, instead of replacing the linear equation $\lfloor 6 \rfloor_{15}\, x =
\lfloor 9 \rfloor_{15}$ of case (iii) of 3-1-1 by $6x \equiv 9 \pmod{15}$, it could also be replaced by
$21x \equiv 24 \pmod{15}$, or $-9x \equiv 69 \pmod{15}$, or $81x \equiv -21 \pmod{15}$, and so
on, because, as stated in 3-1-2, any representatives a of $\lfloor 6 \rfloor_{15}$ and b of $\lfloor 9 \rfloor_{15}$
may be used. In this connection, it is useful to point out explicitly how the
solutions for the many possible replacements of a linear equation $\alpha x = \beta$ in
$\mathbf{Z}_m$ by congruences of form $ax \equiv b \pmod{m}$ are related; namely,


3-1-3. Proposition. Suppose a, a', b, b' are integers with $a \equiv a' \pmod{m}$
and $b \equiv b' \pmod{m}$. Then the solutions of the linear congruence $ax \equiv
b \pmod{m}$ are identical with the solutions of the linear congruence
$a'x \equiv b' \pmod{m}$.


Proof: The hypothesis says (in virtue of our knowledge of Section 1-7)
that $\lfloor a \rfloor_m = \lfloor a' \rfloor_m$ and $\lfloor b \rfloor_m = \lfloor b' \rfloor_m$. Call these $\alpha$ and $\beta$, respectively.
Consequently, by applying 3-1-2 twice, we have: $c \in \mathbf{Z}$ is a solution of $ax \equiv b \pmod{m}$
$\iff$ $\gamma = \lfloor c \rfloor_m \in \mathbf{Z}_m$ is a solution of $\alpha x = \beta$ $\iff$ $c \in \mathbf{Z}$ is a solution of $a'x \equiv
b' \pmod{m}$. This does it. |

Incidentally, the reader should find it easy to prove this fact directly from
the properties of congruence, without any recourse to residue classes.

This result, 3-1-3, tells us, for example, that to solve the congruence
$3155x \equiv 7847 \pmod{18}$ is the same as solving the somewhat simpler congruence
$5x \equiv -1 \pmod{18}$ [since $3155 \equiv 5 \pmod{18}$ and $7847 \equiv -1 \pmod{18}$].
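
The reduction described by 3-1-3 can be checked numerically for this example. The sketch below uses an illustrative helper `solutions_mod` that lists, by brute force, the residues in {0, ..., m - 1} solving a congruence; it is a check on one instance, not a proof of the proposition.

```python
def solutions_mod(a, b, m):
    """Set of c in {0, ..., m-1} with a*c congruent to b mod m."""
    return {c for c in range(m) if (a * c - b) % m == 0}

# The coefficients reduce as claimed: 3155 = 5 and 7847 = 17 = -1 (mod 18).
print(3155 % 18, 7847 % 18)                                        # 5 17
# Both congruences have the same solutions modulo 18.
print(solutions_mod(3155, 7847, 18) == solutions_mod(5, -1, 18))   # True
print(sorted(solutions_mod(5, -1, 18)))                            # [7]
```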

Now, as the next stage of our discussion, we turn to the problem of solving 
the general linear congruence 

ax = b (mod m), a, be Z. 

Consider an integer $x_0$; then

$x_0$ is a solution of $ax \equiv b \pmod{m}$
$\iff ax_0 \equiv b \pmod{m}$
$\iff m \mid (b - ax_0)$
$\iff$ there exists $y_0 \in \mathbf{Z}$ such that $my_0 = b - ax_0$
$\iff$ there exists $y_0 \in \mathbf{Z}$ such that $ax_0 + my_0 = b$.

This proves:

3-1-4. Proposition. The linear congruence $ax \equiv b \pmod{m}$ has a solution
$x_0$ $\iff$ the linear diophantine equation $ax + my = b$ has a solution
$\{x_0, y_0\}$.

This result “transforms” a linear congruence into a linear diophantine
equation. Note that the same $x_0$ appears on both sides of the $\iff$ sign. In
particular, because $\{x_0 = 5, y_0 = -4\}$ is a solution of $11x + 13y = 3$, it
follows that $x_0 = 5$ is a solution of $11x \equiv 3 \pmod{13}$.
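
The passage between a congruence and its diophantine equation is a one-line computation. In the sketch below, the helper name `diophantine_partner` is illustrative: given a solution $x_0$ of the congruence, it recovers the matching $y_0$.

```python
def diophantine_partner(a, b, m, x0):
    """Given x0 with a*x0 congruent to b mod m, return y0 with a*x0 + m*y0 == b."""
    y0, remainder = divmod(b - a * x0, m)
    if remainder != 0:
        raise ValueError("x0 does not solve the congruence")
    return y0

y0 = diophantine_partner(11, 3, 13, 5)
print(y0)                      # -4
print(11 * 5 + 13 * y0)        # 3
```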

Because we learned long ago (in Section 1-6) how to solve linear dio¬ 
phantine equations completely, we are now in a position to settle all questions 
facing us. However, one preliminary remark is needed. 

Suppose $c \in \mathbf{Z}$ is a solution of $ax \equiv b \pmod{m}$. If $c' \equiv c \pmod{m}$ (that is,
if $c' \in \lfloor c \rfloor_m$), then $ac' \equiv ac \equiv b \pmod{m}$, so $c' \in \mathbf{Z}$ is also a solution. It is
customary to group all the elements of $\lfloor c \rfloor_m$ together and say that $\lfloor c \rfloor_m$ is a
solution (or “one solution”) of $ax \equiv b \pmod{m}$. We shall often do so, even
though it is not quite accurate. The object $\lfloor c \rfloor_m$, being an element of $\mathbf{Z}_m$,
cannot possibly be a solution of $ax \equiv b \pmod{m}$; however, $\lfloor c \rfloor_m$ is indeed a
solution of the corresponding equation $\alpha x = \beta$ with $\alpha = \lfloor a \rfloor_m$, $\beta = \lfloor b \rfloor_m$ in
$\mathbf{Z}_m$. A phrase like “$\lfloor c \rfloor_m$ is a solution of $ax \equiv b \pmod{m}$” will be, for us,
simply a convenient way to say that every integer belonging to $\lfloor c \rfloor_m$ is a
solution.

3-1-5. Theorem. The linear congruence $ax \equiv b \pmod{m}$ has a solution
$\iff$ (a, m) divides b.

Furthermore, if $x_0 \in \mathbf{Z}$ is any solution of $ax \equiv b \pmod{m}$ and we write
d = (a, m), m = m'd, then all the solutions are of form

$x = x_0 + tm', \qquad t = 0, \pm 1, \pm 2, \ldots.$

This infinite collection of solutions breaks up into d distinct residue classes

$\lfloor x_0 \rfloor_m, \; \lfloor x_0 + m' \rfloor_m, \; \lfloor x_0 + 2m' \rfloor_m, \; \ldots, \; \lfloor x_0 + (d - 1)m' \rfloor_m,$

so we say that there are exactly d solutions of $ax \equiv b \pmod{m}$. To put it
another way, the equation $\alpha x = \beta$ where $\alpha = \lfloor a \rfloor_m$, $\beta = \lfloor b \rfloor_m$ has, in $\mathbf{Z}_m$,
the d solutions:

$\{\lfloor x_0 + sm' \rfloor_m \mid s = 0, 1, \ldots, d - 1\}.$

Proof: We have m > 1, and there is no harm in assuming $a \ne 0$, as the
case a = 0 is surely not worth considering. According to 3-1-4 and 1-6-1,
$ax \equiv b \pmod{m}$ has a solution $\iff$ $ax + my = b$ has a solution $\iff$ (a, m)
divides b. This proves the first part.

If $x_0$ is a solution of $ax \equiv b \pmod{m}$, then, by 3-1-4, there is associated
with it a solution $\{x_0, y_0\}$ of the diophantine equation $ax + my = b$. So 1-6-1
(with an obvious modification of notation) tells us that all solutions of
$ax + my = b$ are given by

$x = x_0 + t\Bigl(\frac{m}{d}\Bigr), \qquad y = y_0 - t\Bigl(\frac{a}{d}\Bigr), \qquad t = 0, \pm 1, \pm 2, \ldots.$

By another application of 3-1-4, it follows that all solutions of $ax \equiv b \pmod{m}$
are given by

$x = x_0 + tm', \qquad t = 0, \pm 1, \pm 2, \ldots.$

This shows, in particular, that as soon as a single solution of the congruence
$ax \equiv b \pmod{m}$ is known, we can immediately produce all solutions, and
there are an infinite number of them.

Now, from the set of all solutions let us select d of them; namely, let us
take

$x_0, \; x_0 + m', \; x_0 + 2m', \; \ldots, \; x_0 + (d - 1)m'.$

We assert, first of all, that no two of these are congruent modulo m. To see
this, consider any two of them

$x_0 + im'$ and $x_0 + jm'$

where, of course, $0 \le i \le d - 1$ and $0 \le j \le d - 1$. Then

$x_0 + im' \equiv x_0 + jm' \pmod{m} \Rightarrow (i - j)m' \equiv 0 \pmod{m}$
$\Rightarrow m \mid (i - j)m'$
$\Rightarrow m'd \mid (i - j)m'$
$\Rightarrow d \mid (i - j).$

Because of the constraints on i and j, this implies i = j; so, indeed, our d
solutions are incongruent to each other modulo m. This also guarantees (by
1-7-11) that the d residue classes

$\lfloor x_0 \rfloor_m, \; \lfloor x_0 + m' \rfloor_m, \; \lfloor x_0 + 2m' \rfloor_m, \; \ldots, \; \lfloor x_0 + (d - 1)m' \rfloor_m$

are distinct (and disjoint).

Furthermore, we assert that every solution is congruent (mod m) to one
of our d selected solutions. To see this, consider any solution $x_0 + tm'$. By
the division algorithm, we may write t = qd + r with $0 \le r < d$. Then

$x_0 + tm' = x_0 + qm + rm' \equiv x_0 + rm' \pmod{m}$

while $x_0 + rm'$ is clearly an element of our set $\{x_0 + sm' \mid s = 0, 1, \ldots, d - 1\}$
of d selected solutions. We see, therefore, that any solution $x_0 + tm'$ belongs
to one of the residue classes $\lfloor x_0 + sm' \rfloor_m$, s = 0, 1, 2, ..., d - 1. On the other
hand, if an integer w belongs to one of these d residue classes, say $w \in
\lfloor x_0 + im' \rfloor_m$ where $0 \le i \le d - 1$, then $w \equiv x_0 + im' \pmod{m}$ and it can be written
as $w = x_0 + im' + km = x_0 + (i + kd)m'$, which is of form $x_0 + tm'$. It
follows that the set of all solutions decomposes into the d distinct residue
classes $\lfloor x_0 + sm' \rfloor_m$, s = 0, 1, ..., d - 1. Symbolically, we have

$\{x_0 + tm' \mid t \in \mathbf{Z}\} = \lfloor x_0 \rfloor_m \cup \lfloor x_0 + m' \rfloor_m \cup \cdots \cup \lfloor x_0 + (d - 1)m' \rfloor_m$

where the union on the right is a disjoint union.

Finally, according to 3-1-2, we obtain all the solutions of $\alpha x = \beta$, where
$\alpha = \lfloor a \rfloor_m$, $\beta = \lfloor b \rfloor_m$, by taking the residue classes of all the solutions of
$ax \equiv b \pmod{m}$. In view of the foregoing, taking the residue classes of all
$x_0 + tm'$, $t \in \mathbf{Z}$, leaves us, precisely, with the d elements $\lfloor x_0 + sm' \rfloor_m$ of $\mathbf{Z}_m$.
This completes the proof. |
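
Theorem 3-1-5 translates directly into a procedure. The following is a minimal sketch, assuming m > 1 and using the extended Euclidean algorithm to produce one solution $x_0$; the function names `extended_gcd` and `solve_linear_congruence` are illustrative and do not come from the text.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) >= 0 and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    if old_r < 0:  # normalize the sign so the gcd is nonnegative
        old_r, old_s, old_t = -old_r, -old_s, -old_t
    return old_r, old_s, old_t

def solve_linear_congruence(a, b, m):
    """Return the d = (a, m) solutions modulo m of a*x = b (mod m), or [] if none exist."""
    d, s, _ = extended_gcd(a, m)       # s*a + t*m == d
    if b % d != 0:                     # (a, m) must divide b
        return []
    m_prime = m // d
    x0 = (s * (b // d)) % m_prime      # one solution, reduced modulo m'
    return [x0 + k * m_prime for k in range(d)]

print(solve_linear_congruence(6, 2, 7))     # [5]
print(solve_linear_congruence(6, 3, 10))    # []
print(solve_linear_congruence(6, 9, 15))    # [4, 9, 14]
```

The list returned contains one representative from each of the d residue classes $\lfloor x_0 + sm' \rfloor_m$, s = 0, 1, ..., d - 1; any other starting solution $x_0$ would give the same classes.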

3-1-6. Examples. We show how the preceding results apply to numerical 
examples. 

(i) Consider the linear congruence 

6x = 2 (mod 7). 

Here m = l , a = 6, b = 2, and d = ( a , m) = (6, 7) = 1, which surely divides 
b — 2. Therefore, a solution exists. 

Obviously, x 0 = 5 is a solution. Since m' = m/d = 7, 3-1-5 tells us that all 
solutions are given by 

{xq + tm 11 g Z} = {5 + It 11 g Z}. 

An explicit listing of all the solutions is 

..., -23, -16, -9, -2, 5, 12, 19, 26, 33,40,.... 

Because d = 1, the set of all solutions consists of a single residue class, |_5_| 7 . 

As for the corresponding linear equation in Z 7 , (_6j 7 x = f_2j 7 , we see 
that it has the single solution |_5_| 7 in Z 7 . Of course, this fact was already 
indicated in part (/) of 3-1-1. 

(ii) Consider the linear equation 

lAlio^=lAlio 

in Z 10 . Corresponding to this, we may consider the linear congruence 
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6 x = 3 (mod 10). Here, m = 10, a = 6, b = 3. Since (a, m) = 2 does not 
divide b = 3, the congruence has no solution. Hence, in virtue of 3-1-2, the 
equation [_6J 10 a; = (_3ji 0 has no solution in Z 10 —a fact that was noted in 
part {ii) of 3-1-1. 

(iii) Consider the congruence

6x ≡ 9 (mod 15).

Here m = 15, a = 6, b = 9. Clearly, d = (a, m) = (6, 15) = 3, and this divides
b = 9, so a solution exists. It is not hard to find a solution; for example,
x_0 = 4. (Because the numbers are small and we need to find only one solution,
it is probably most efficient to find it by trial and error.) Since m' = m/d =
15/3 = 5, all the solutions are of form

{x_0 + tm' | t ∈ Z} = {4 + 5t | t ∈ Z}.

In detail, the solutions are

..., -11, -6, -1, 4, 9, 14, 19, 24, 29, ....

As described in 3-1-5, the set of all solutions decomposes into d = 3
congruence classes, namely,

[4]_15, [9]_15, [14]_15

(because the solutions are [x_0]_15, [x_0 + m']_15, [x_0 + 2m']_15). Note that if we
start with the solution x_0 = 9, instead of x_0 = 4, we end up with all solutions
given by three residue classes: [9]_15, [14]_15, [19]_15; and these are the same
residue classes as above, since [19]_15 = [4]_15. Of course, the theorem says that
we arrive at the same three residue classes no matter which solution x_0 is taken
at the start.

In connection with part (iii) of 3-1-1, we now "understand" why the
linear equation [6]_15 x = [9]_15 in Z_15 has exactly three solutions, namely,
the three elements [4]_15, [9]_15, [14]_15 of Z_15.
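For moduli as small as those in examples (i) to (iii), the trial-and-error search mentioned above can be carried out mechanically. The following is only a minimal sketch; the function name brute_force_solutions is a hypothetical helper, not notation from the text.

```python
def brute_force_solutions(a, b, m):
    """Return every x in 0, 1, ..., m-1 with a*x congruent to b modulo m."""
    return [x for x in range(m) if (a * x - b) % m == 0]


# The three small examples: 6x = 2 (mod 7), 6x = 3 (mod 10), 6x = 9 (mod 15).
print(brute_force_solutions(6, 2, 7))    # one residue class
print(brute_force_solutions(6, 3, 10))   # no solution
print(brute_force_solutions(6, 9, 15))   # d = 3 residue classes
```

Each returned integer represents one residue class modulo m, so the length of the list is the number of solutions in Z_m.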

(iv) It is time we turned to less trivial examples. Consider the linear
congruence

11,799x ≡ 8715 (mod 1081).

Because 11,799 ≡ 989 (mod 1081) and 8715 ≡ 67 (mod 1081), it follows from
3-1-3 that the solutions of this congruence are identical with those of

989x ≡ 67 (mod 1081).

Here m = 1081, a = 989, b = 67. A straightforward computation gives
(a, m) = (989, 1081) = 23, which does not divide 67, so there is no solution.
(v) Consider the linear equation

[791]_2023 x = [1204]_2023   (*)

in Z_2023. To solve this, we replace it with the associated linear congruence

791x ≡ 1204 (mod 2023)   (**)

for which m = 2023, a = 791, b = 1204. Applying the Euclidean algorithm,
we obtain d = (a, m) = (791, 2023) = 7. Since 7 divides b = 1204, a solution
exists; in fact, there will be seven solutions.

We need to find a single solution x_0 of our congruence. Guesswork is not
practical here. Instead, we locate a solution of the linear diophantine equation

791x + 2023y = 1204.   (***)

From the Euclidean algorithm (which the reader has presumably computed 
already), the gcd 7 = (791, 2023) can be expressed as a linear combination 

7 = 133 • 791 - 52 • 2023. 


Therefore,

(172 · 133)791 - (172 · 52)2023 = 1204

and the pair {172 · 133, -172 · 52} is a solution of the linear diophantine
equation (***). Consequently, 172 · 133 = 22,876 is a solution of the linear
congruence (**). Because 22,876 ≡ 623 (mod 2023), we know that 623 is a
solution of (**). We put x_0 = 623. (This choice of x_0 is clearly more convenient
than taking x_0 = 22,876; of course, the choice of x_0 does not affect the end
result.) Then m' = m/d = 2023/7 = 289 and all solutions of (**) are given by
the seven residue classes

{[623 + 289s]_2023 | s = 0, 1, ..., 6}.

By checking the arithmetic, one sees that

[45]_2023, [334]_2023, [623]_2023, [912]_2023, [1201]_2023, [1490]_2023, [1779]_2023

are the seven solutions of the linear equation (*) in Z_2023.

It may be of interest to note how this last example was constructed. Given
the desire to produce a linear congruence ax ≡ b (mod m) with exactly seven
solutions, it was necessary to have d = (a, m) = 7; the choices a = 7 · 113 =
791, m = 7 · 289 = 2023 clearly accomplished this. It remained to choose
an integer b divisible by (a, m) = 7. The seven solutions would then be
{[x_0 + 289s]_2023 | s = 0, 1, ..., 6} where x_0 is any integer which satisfies the
congruence. In order to guarantee that the work involved in trying to find a
single solution of ax ≡ b (mod m) by trial and error would be prohibitive, the
smallest positive integer solution would have to be sufficiently large. (In this
way, anyone who tried to solve by testing x_0 = 1, 2, 3, 4, ... could be expected
to give up the effort.) The choice of x_0 = 45 accomplished this, and because

791 · 45 = 35,595 ≡ 1204 (mod 2023),

we were led to the choice of b = 1204.

Of course, when solving 791x ≡ 1204 (mod 2023) we found one solution
to be x 0 = 623. Our method of solution does not lead immediately to the 
smallest positive solution 45. This solution is not lost; it arises when we con¬ 
struct all seven solutions from the single solution x 0 = 623. 
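The procedure of example (v) (Euclidean algorithm, one particular solution, then all d solutions spaced m' apart) can be sketched in code. This is an illustrative sketch rather than the author's own formulation; the names extended_gcd and solve_linear_congruence are hypothetical, and positive a and m are assumed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_congruence(a, b, m):
    """Return the solutions of a*x = b (mod m) lying in 0..m-1, sorted.

    Assumes a > 0 and m > 0. The list is empty when d = (a, m) does not
    divide b; otherwise it has exactly d entries, spaced m' = m/d apart.
    """
    d, x, _ = extended_gcd(a, m)
    if b % d != 0:
        return []
    m_prime = m // d
    x0 = (x * (b // d)) % m_prime          # smallest nonnegative solution
    return [x0 + s * m_prime for s in range(d)]


print(solve_linear_congruence(791, 1204, 2023))
print(solve_linear_congruence(989, 67, 1081))
```

Reducing the particular solution modulo m' first means the list starts at the smallest nonnegative solution, which is why 45 appears before 623 here.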

Now that a single linear congruence is under control (in the sense that we 
know how to solve it), it is natural to examine the problem of solving several 
simultaneous congruences. 


3-1-7. Examples. (i) Consider the two congruences

x ≡ 5 (mod 8), x ≡ 4 (mod 9).

One way to solve these simultaneously is to note that all solutions of the first
congruence are given by

x = 5 + 8t, t ∈ Z.   (#)

Substituting in the second congruence, we have 5 + 8t ≡ 4 (mod 9), or

8t ≡ -1 (mod 9).

Since (8, 9) = 1, which divides -1, this congruence has a solution; clearly,
t_0 = 1 is a solution, and then all solutions are of form t = 1 + 9s, s ∈ Z.
Substituting in (#), it follows that all solutions of our simultaneous
congruences are given by

x = 5 + 8(1 + 9s) = 13 + 72s, s ∈ Z.

So we have the infinite set of solutions

..., 13, 85, 157, 229, 301, ...

and this set can be described simply as the residue class [13]_72.

(ii) Similarly, consider the simultaneous congruences

x ≡ 5 (mod 8), x ≡ 4 (mod 18).

Again, the first has all its solutions of form 5 + 8t, t ∈ Z, and substituting in
the second congruence gives

8t ≡ -1 (mod 18).

But this congruence has no solutions, because (8, 18) = 2 does not divide -1;
so our simultaneous congruences have no solution.
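The substitution method used in (i) and (ii) can be written as a small routine for two congruences. The sketch below is an assumption-laden illustration: merge_two is a hypothetical name, and the built-in pow(k, -1, n) is used for the modular inverse, which requires Python 3.8 or later.

```python
from math import gcd


def merge_two(a1, m1, a2, m2):
    """Solve x = a1 (mod m1), x = a2 (mod m2) by substitution.

    Write x = a1 + m1*t and solve m1*t = a2 - a1 (mod m2) for t.
    Return (x0, modulus) describing the residue class of all solutions,
    or None when the pair has no solution.
    """
    d = gcd(m1, m2)
    rhs = a2 - a1
    if rhs % d != 0:
        return None                        # (m1, m2) does not divide a2 - a1
    m2_reduced = m2 // d
    # m1/d is relatively prime to m2/d, so it has an inverse modulo m2/d.
    t0 = (rhs // d) * pow(m1 // d, -1, m2_reduced) % m2_reduced
    modulus = m1 * m2_reduced
    return (a1 + m1 * t0) % modulus, modulus


print(merge_two(5, 8, 4, 9))     # example (i)
print(merge_two(5, 8, 4, 18))    # example (ii)
```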
The same procedure surely carries over to more than two simultaneous
congruences, but things get somewhat tedious as the number of congruences 
increases. In such a situation we can often make use of the following crucial 
result. 


3-1-8. Chinese Remainder Theorem. Suppose the integers m_1, m_2, ..., m_n,
all of which are greater than 1, are relatively prime in pairs, and we put
m_1 m_2 ··· m_n = m. Then, for any choice of integers a_1, a_2, ..., a_n the
simultaneous congruences

x ≡ a_1 (mod m_1),
x ≡ a_2 (mod m_2),
...
x ≡ a_n (mod m_n)

have a solution. Moreover, if x_0 ∈ Z is a solution, then the elements of the
residue class [x_0]_m constitute the set of all solutions.


Proof: Let us prove the last part first. Suppose x_0 is a solution of our
simultaneous congruences and consider the residue class [x_0]_m. We need to
show that for y_0 ∈ Z

y_0 is a solution ⇔ y_0 ∈ [x_0]_m.

Now, y_0 ∈ [x_0]_m ⇒ y_0 ≡ x_0 (mod m) ⇒ y_0 ≡ x_0 (mod m_i) for each i = 1, ..., n,
and then y_0 is clearly a solution of all the congruences. Conversely, if y_0 is a
solution of the simultaneous congruences, then, for each i = 1, ..., n we have
y_0 ≡ a_i (mod m_i) and x_0 ≡ a_i (mod m_i), so that y_0 ≡ x_0 (mod m_i). In other
words, m_i | (y_0 - x_0) for i = 1, ..., n. By definition of least common multiple,
it follows that

[m_1, m_2, ..., m_n] | (y_0 - x_0).

Because m_1, ..., m_n are positive and relatively prime in pairs, one sees easily
that

[m_1, m_2, ..., m_n] = m_1 m_2 ··· m_n = m.

Consequently, m | (y_0 - x_0), y_0 ≡ x_0 (mod m), and y_0 ∈ [x_0]_m. Thus, we have
shown that the elements of [x_0]_m are indeed all the solutions.

As for existence of a solution, instead of following the method used in
3-1-7 which involves working through the congruences one at a time, we give
a proof which treats all the congruences simultaneously and in a uniform
manner. For each j = 1, ..., n we put m_j' = m/m_j; in other words, m_j' is the
product of all m_i with the factor m_j excluded; symbolically:

m_j' = ∏_{i ≠ j} m_i,   j = 1, 2, ..., n.

By hypothesis, m_1, m_2, ..., m_n are relatively prime in pairs, and this may be
expressed by the notation:

(m_i, m_j) = 1, when i ≠ j.

Since any m_j is relatively prime to each of the other m_i, it follows from 1-3-10,
part (iv) that it is relatively prime to their product; in symbols,

(m_j', m_j) = (∏_{i ≠ j} m_i, m_j) = 1,   j = 1, ..., n.

Now, according to the theory of a single linear congruence, because
(m_j', m_j) = 1 there exists, for each j = 1, ..., n, an integer b_j such that

m_j' b_j ≡ 1 (mod m_j).

We observe further that

m_j' b_j ≡ 0 (mod m_i) for i ≠ j

because m_i | m_j' when i ≠ j. Consequently, for each j, the integer m_j' b_j a_j
satisfies

m_j' b_j a_j ≡ a_j (mod m_j),
m_j' b_j a_j ≡ 0 (mod m_i) for i ≠ j.

By applying the rule for addition of congruences, we deduce that the integer

x_0 = m_1' b_1 a_1 + m_2' b_2 a_2 + ··· + m_n' b_n a_n = Σ_{j=1}^{n} m_j' b_j a_j

satisfies x_0 ≡ a_i (mod m_i) for all i = 1, 2, ..., n. This proves that a solution of
our simultaneous congruences exists; moreover, in virtue of what we already
know about the set of all solutions, the solution of our simultaneous linear
congruences is often said to be unique (mod m). The proof is now complete. ∎

It is worthwhile to examine the idea underlying our proof of the Chinese 
remainder theorem. The simplest nontrivial system of congruences of the 
form considered in the Chinese remainder theorem is surely 

x ≡ 1 (mod m_1),
x ≡ 0 (mod m_2),
x ≡ 0 (mod m_3),
...
x ≡ 0 (mod m_n).

Of course, there is no reason for specifying the first congruence as the one
where 1 appears on the right-hand side. In fact, for any choice of j from
{1, 2, ..., n} we have a "simplest system" of the same type, namely,

x ≡ 1 (mod m_j),
x ≡ 0 (mod m_i) for each i ≠ j.

We first showed that for each j this system has a solution, namely, m_j' b_j.
Then from these m_j' b_j, j = 1, ..., n we constructed a solution of the general
system of congruences by taking an appropriate linear combination, namely,
a_1 m_1' b_1 + a_2 m_2' b_2 + ··· + a_n m_n' b_n.

3-1-9. Example. The proof of the Chinese remainder theorem provides a 
straightforward recipe for finding a solution of simultaneous congruences. 
To illustrate, let us solve the simultaneous congruences 

x ≡ 1 (mod 2),
x ≡ 2 (mod 3),
x ≡ 3 (mod 5),
x ≡ 5 (mod 7).

In this situation we have

m_1 = 2, m_2 = 3, m_3 = 5, m_4 = 7,
a_1 = 1, a_2 = 2, a_3 = 3, a_4 = 5.

(Of course, a solution exists because m_1, m_2, m_3, m_4 are relatively prime in
pairs.) Then m = 2 · 3 · 5 · 7 = 210 and

m_1' = 105, m_2' = 70, m_3' = 42, m_4' = 30.

We need to find integers b_1, b_2, b_3, b_4 which satisfy the congruences 105b_1 ≡
1 (mod 2), 70b_2 ≡ 1 (mod 3), 42b_3 ≡ 1 (mod 5), 30b_4 ≡ 1 (mod 7), respectively.
This is easy, for by 3-1-3 these congruences reduce to 1b_1 ≡ 1 (mod 2),
1b_2 ≡ 1 (mod 3), 2b_3 ≡ 1 (mod 5), 2b_4 ≡ 1 (mod 7), respectively; so we may
take

b_1 = 1, b_2 = 1, b_3 = -2, b_4 = 4.

(Note that any choices of the b's are permissible, so long as each one satisfies
the appropriate congruence. For example, we could take b_1 = -1, b_2 = -2,
b_3 = 3, b_4 = 4.) Now, the theorem tells us that

x_0 = m_1' b_1 a_1 + m_2' b_2 a_2 + m_3' b_3 a_3 + m_4' b_4 a_4
    = (105)(1)(1) + (70)(1)(2) + (42)(-2)(3) + (30)(4)(5)
    = 593

is a solution. In fact, the set of all solutions is [593]_210 = [173]_210, and 173 is
the smallest positive solution.

It is important to observe that once we have performed the steps needed
to solve this system of congruences, we can, almost immediately, write down
the solution of any other system of congruences with the same m_1, m_2, m_3,
m_4. For example, consider the system

x ≡ 0 (mod 2),
x ≡ 1 (mod 3),
x ≡ 4 (mod 5),
x ≡ 2 (mod 7).

In this situation, we have, as before,

m_1 = 2, m_2 = 3, m_3 = 5, m_4 = 7,
m_1' = 105, m_2' = 70, m_3' = 42, m_4' = 30,
b_1 = 1, b_2 = 1, b_3 = -2, b_4 = 4.

Only the a_i's are new, namely,

a_1 = 0, a_2 = 1, a_3 = 4, a_4 = 2.

Thus, a solution is given by

(105)(1)(0) + (70)(1)(1) + (42)(-2)(4) + (30)(4)(2) = -26

and [-26]_210 is the set of all solutions.

Of course, the same principle applies whenever we take a system of
congruences for which the Chinese remainder theorem is applicable and change
the a_i's but not the m_i's.
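The recipe of 3-1-9, including the reuse of the products m_j' b_j when only the a_i's change, might be sketched as follows. The names crt_basis and crt_solve are hypothetical; the sketch assumes the moduli are relatively prime in pairs and relies on math.prod and pow(k, -1, n) from Python 3.8 or later. The inverses it picks are the least positive ones (so b_3 = 3 rather than -2), which the text notes is equally permissible.

```python
from math import prod


def crt_basis(moduli):
    """Return (m, basis) where basis[j] = m_j' * b_j.

    Here m is the product of the moduli, m_j' = m / m_j, and b_j satisfies
    m_j' * b_j = 1 (mod m_j). pow raises ValueError if some m_j' is not
    invertible modulo m_j, i.e. if the moduli are not pairwise coprime.
    """
    m = prod(moduli)
    basis = []
    for m_j in moduli:
        m_j_prime = m // m_j
        b_j = pow(m_j_prime, -1, m_j)
        basis.append(m_j_prime * b_j)
    return m, basis


def crt_solve(residues, m, basis):
    """Combine the basis with the a_j's; return the least nonnegative solution."""
    return sum(a_j * e_j for a_j, e_j in zip(residues, basis)) % m


m, basis = crt_basis([2, 3, 5, 7])          # computed once
print(crt_solve([1, 2, 3, 5], m, basis))    # first system of 3-1-9
print(crt_solve([0, 1, 4, 2], m, basis))    # second system, same moduli
```

The second result is the least nonnegative member of [-26]_210.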

3-1-10. Remark. The Chinese remainder theorem may be interpreted in
another way. Consider the system of congruences

x ≡ a_i (mod m_i),   i = 1, ..., n.   (*)

Looking at the ith congruence, we note that if x_0 is an integer, then, according
to the definition of residue class, x_0 ≡ a_i (mod m_i) ⇔ x_0 ∈ [a_i]_{m_i}. Therefore,
an integer x_0 is a solution of the system of simultaneous congruences (*) ⇔ x_0
belongs to every one of the residue classes [a_i]_{m_i}, i = 1, ..., n; that is, ⇔ x_0
belongs to the intersection of the residue classes [a_i]_{m_i}, i = 1, ..., n. (Here, of
course, we are viewing a residue class [a_i]_{m_i} as a set of integers.) This says that
the set of all solutions of the system (*) is precisely the intersection

⋂_{i=1}^{n} [a_i]_{m_i} = [a_1]_{m_1} ∩ [a_2]_{m_2} ∩ ··· ∩ [a_n]_{m_n}.

What is this intersection? In the case where the m_i are relatively prime in
pairs, the Chinese remainder theorem tells us this intersection is nonempty;
even more, the intersection is a residue class (mod m_1 m_2 ··· m_n). For example,
from this point of view, the result proved in 3-1-9 may be expressed in the
form

[1]_2 ∩ [2]_3 ∩ [3]_5 ∩ [5]_7 = [173]_210.

3-1-11. Examples. We conclude this section with several additional 
examples. These will indicate how we can go about solving any collection of 
simultaneous linear congruences. 

(i) Consider the simultaneous congruences

x ≡ 6 (mod 17),
x ≡ 17 (mod 24),
x ≡ 13 (mod 33).

Because m_1 = 17, m_2 = 24, m_3 = 33 are not relatively prime in pairs, the
Chinese remainder theorem does not apply. Instead, we must look carefully at
the last two congruences, wherein lies the source of the difficulty. For an
integer x_0, we have [since 24 = 3 · 8 and (3, 8) = 1]

x_0 is a solution of x ≡ 17 (mod 24)
⇔ x_0 ≡ 17 (mod 24)
⇔ 24 | (x_0 - 17)
⇔ 3 | (x_0 - 17) and 8 | (x_0 - 17)
⇔ x_0 ≡ 17 (mod 3) and x_0 ≡ 17 (mod 8)
⇔ x_0 ≡ 2 (mod 3) and x_0 ≡ 1 (mod 8)
⇔ x_0 is a solution of both x ≡ 2 (mod 3) and x ≡ 1 (mod 8).

Thus, the congruence x ≡ 17 (mod 24) may be replaced by the pair of
congruences x ≡ 2 (mod 3) and x ≡ 1 (mod 8) without losing or gaining any solutions.
In similar fashion, the congruence x ≡ 13 (mod 33) can be replaced by the
pair of congruences x ≡ 1 (mod 3) and x ≡ 2 (mod 11). Therefore, our system
of three congruences may be replaced by the system

x ≡ 6 (mod 17),
x ≡ 2 (mod 3),
x ≡ 1 (mod 8),
x ≡ 1 (mod 3),
x ≡ 2 (mod 11).

The second and fourth of these congruences are clearly contradictory (after
all, an integer cannot be congruent to 2 and also congruent to 1 (mod 3)), so
our system of congruences has no solution. To put it another way, we have
shown that

[6]_17 ∩ [17]_24 ∩ [13]_33 = ∅.

(ii) Consider the three congruences

x ≡ 5 (mod 18),
x ≡ -1 (mod 24),
x ≡ 17 (mod 33).

Again, the Chinese remainder theorem does not apply, so by the procedure
used above, we may "break down" each of these congruences and replace
this system by the system

(1) x ≡ 1 (mod 2),
(2) x ≡ 5 (mod 9),
(3) x ≡ -1 (mod 3),
(4) x ≡ -1 (mod 8),
(5) x ≡ 2 (mod 3),
(6) x ≡ 6 (mod 11).

(Of course, this new system has exactly the same solutions as the original
system; this is what "replacement" is about.) The Chinese remainder theorem
still does not apply. However, suppose we look at congruences (2), (3), and
(5). If x_0 is a solution of (2), then, clearly, x_0 is a solution of both (3) and (5).
[Note that (3) and (5) are really the same congruence.] It follows that
congruences (3) and (5) may be discarded without affecting the solution set. Similarly,
if x_0 is a solution of (4), then it is automatically a solution of (1), so
congruence (1) may also be discarded. We are left, therefore, with the system

x ≡ 5 (mod 9),
x ≡ -1 (mod 8),
x ≡ 6 (mod 11).

Now, the Chinese remainder theorem does apply. It turns out (the
computations being left to the reader) that this system has the unique solution
[743]_792. Hence, the set [743]_792 consists of all integers which are solutions
of our original system of linear congruences; this is often expressed as "743
is the unique solution (mod 792)."

From all this, we conclude that

[5]_18 ∩ [-1]_24 ∩ [17]_33 = [1]_2 ∩ [5]_9 ∩ [-1]_3 ∩ [-1]_8 ∩ [2]_3 ∩ [6]_11
                           = [5]_9 ∩ [-1]_8 ∩ [6]_11
                           = [743]_792.
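The intersections of residue classes in examples (i) and (ii) can be double-checked by direct enumeration, since the solution set of such a system repeats with period equal to the least common multiple of the moduli. The sketch below uses brute force, not the method of the text; intersect_classes is a hypothetical helper name.

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def intersect_classes(classes):
    """classes is a list of pairs (a, m), each standing for [a]_m.

    Return (hits, L): L is the least common multiple of the moduli and
    hits lists the integers in 0..L-1 that lie in every class.
    """
    L = reduce(lcm, (m for _, m in classes))
    hits = [x for x in range(L)
            if all((x - a) % m == 0 for a, m in classes)]
    return hits, L


print(intersect_classes([(6, 17), (17, 24), (13, 33)]))   # example (i)
print(intersect_classes([(5, 18), (-1, 24), (17, 33)]))   # example (ii)
```

Enumeration is only practical for small moduli; it is offered here as a check on the hand computation.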

(iii) Consider the pair of congruences

2x ≡ 5 (mod 6),
5x ≡ 3 (mod 11).

The first of these has no solutions [since (2, 6) = 2 does not divide 5] so,
a fortiori, the simultaneous congruences have no solutions.

(iv) Consider the pair of congruences

2x ≡ 4 (mod 6),
5x ≡ 3 (mod 11).   (*)

It is easy to see that the first of these has the two solutions [2]_6 and [5]_6, and
the second has the unique solution [5]_11. In other words, x_0 is a solution of the
first congruence if and only if x_0 ≡ 2 (mod 6) or x_0 ≡ 5 (mod 6), and x_0 is a
solution of the second congruence ⇔ x_0 ≡ 5 (mod 11). Therefore, x_0 is a
solution of the pair of congruences (*) if and only if x_0 is a solution of either
the pair of congruences

x ≡ 2 (mod 6),
x ≡ 5 (mod 11),   (**)

or the pair of congruences

x ≡ 5 (mod 6),
x ≡ 5 (mod 11).   (***)

Thus, to solve (*), we need to solve both (**) and (***). This is easy. By the
Chinese remainder theorem, (**) has the unique solution [38]_66 and (***) has
the unique solution [5]_66, so the system (*) has the two solutions [5]_66 and
[38]_66. In other words, the solutions of (*) are those integers x_0 which satisfy
either x_0 ≡ 5 (mod 66) or x_0 ≡ 38 (mod 66).

The content of our solution of this problem may be expressed compactly
as follows: The set of solutions of 2x ≡ 4 (mod 6) is the set [2]_6 ∪ [5]_6, and
the set of solutions of 5x ≡ 3 (mod 11) is the set [5]_11, so the set of solutions
of the simultaneous pair (*) is the set

([2]_6 ∪ [5]_6) ∩ [5]_11 = ([2]_6 ∩ [5]_11) ∪ ([5]_6 ∩ [5]_11).
(v) Consider the simultaneous congruences

4x ≡ 6 (mod 10),
9x ≡ 15 (mod 21).

The first has the two solutions [4]_10, [9]_10, and the second has the three
solutions [4]_21, [11]_21, [18]_21, so to solve the simultaneous congruences we
must solve each of the following six pairs of congruences.

(1) x ≡ 4 (mod 10), x ≡ 4 (mod 21),
(2) x ≡ 9 (mod 10), x ≡ 4 (mod 21),
(3) x ≡ 4 (mod 10), x ≡ 11 (mod 21),
(4) x ≡ 9 (mod 10), x ≡ 11 (mod 21),
(5) x ≡ 4 (mod 10), x ≡ 18 (mod 21),
(6) x ≡ 9 (mod 10), x ≡ 18 (mod 21).

It is not hard to solve each of these pairs; the results are, respectively,
[4]_210, [109]_210, [74]_210, [179]_210, [144]_210, [39]_210. Thus, an integer x_0
is a solution if and only if x_0 is congruent to one of 4, 39, 74, 109, 144,
179 (mod 210).

There are six solutions (mod 210), and the compact way to describe the
method for locating the solutions is as follows: The set of solutions of 4x ≡
6 (mod 10) is [4]_10 ∪ [9]_10, and the set of solutions of 9x ≡ 15 (mod 21) is
[4]_21 ∪ [11]_21 ∪ [18]_21, so the set of solutions of the simultaneous pair is

([4]_10 ∪ [9]_10) ∩ ([4]_21 ∪ [11]_21 ∪ [18]_21)

= ([4]_10 ∩ [4]_21) ∪ ([4]_10 ∩ [11]_21) ∪ ([4]_10 ∩ [18]_21)
  ∪ ([9]_10 ∩ [4]_21) ∪ ([9]_10 ∩ [11]_21) ∪ ([9]_10 ∩ [18]_21)

= [4]_210 ∪ [74]_210 ∪ [144]_210 ∪ [109]_210 ∪ [179]_210 ∪ [39]_210.
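The method of examples (iv) and (v), solving each congruence separately and then pairing every solution of one with every solution of the other, can be sketched as below. The helper names solutions and crt_pair are hypothetical; the sketch assumes positive coefficients, relatively prime moduli for the pairing step, and Python 3.8 or later for pow(k, -1, n).

```python
from itertools import product
from math import gcd


def solutions(a, b, m):
    """Solutions of a*x = b (mod m) in 0..m-1, or [] if there are none."""
    d = gcd(a, m)
    if b % d != 0:
        return []
    m_prime = m // d
    x0 = (b // d) * pow(a // d, -1, m_prime) % m_prime
    return [x0 + s * m_prime for s in range(d)]


def crt_pair(a1, m1, a2, m2):
    """Solve x = a1 (mod m1), x = a2 (mod m2), assuming (m1, m2) = 1."""
    t = (a2 - a1) * pow(m1, -1, m2) % m2
    return (a1 + m1 * t) % (m1 * m2)


def solve_pair(c1, c2):
    """c1 and c2 are triples (a, b, m) standing for a*x = b (mod m)."""
    (a1, b1, m1), (a2, b2, m2) = c1, c2
    pairs = product(solutions(a1, b1, m1), solutions(a2, b2, m2))
    return sorted(crt_pair(r, m1, s, m2) for r, s in pairs)


print(solve_pair((2, 4, 6), (5, 3, 11)))      # example (iv), modulo 66
print(solve_pair((4, 6, 10), (9, 15, 21)))    # example (v), modulo 210
```

The number of solutions is the product of the two individual counts, 2 · 1 in (iv) and 2 · 3 in (v).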


3-1-12 /PROBLEMS 

1. Without solving, determine how many solutions each of the following 
linear congruences has: 

(i) 5x ≡ 89 (mod 99), (ii) 9x ≡ 15 (mod 21),

(iii) 28x ≡ 48 (mod 1197).

2. Without solving, determine how many solutions each of the following 
linear equations has: 

(') |«*U*=L2J». • («> - 1 1343 U- 

(a)|120|„^-|25[,„. 
3. For α, β ∈ Z_m find a necessary and sufficient condition that the linear
equation αx = β have a unique solution in Z_m.

4. Solve completely: 


(i) 158x ≡ -22 (mod 194),

(ii) 194x ≡ -22 (mod 158),

(iii) 84x ≡ 156 (mod 605),

(iv) 84x ≡ 156 (mod 438).

5. Solve completely: 


(i) [9]_24 x = [6]_24,

(“> [ 2 U 2 .*-Lillis- 

| 

£ 

00 

* 

Ul 

X 

II 

4^ 

U> 

00 

£ 

00 


6. Produce a linear congruence which has exactly four solutions. 

7. Consider the following collection of congruences: 

(1) x ≡ 3 (mod 7),
(2) x ≡ -2 (mod 11),
(3) x ≡ 7 (mod 12),
(4) x ≡ 8 (mod 13),
(5) x ≡ 9 (mod 14),
(6) x ≡ 1 (mod 15),
(7) x ≡ -3 (mod 17),
(8) x ≡ 5 (mod 18).

Solve the system of simultaneous congruences for each of the following
choices of congruences from this collection.

(i) (1) and (2),
(ii) (2) and (3),
(iii) (1), (3), and (4),
(iv) (1), (2), (4), and (7),
(v) (2), (4)-(7),
(vi) (1), (2), (4), (7), and (8),
(vii) (1) and (5),
(viii) (3) and (5),
(ix) (3) and (6),
(x) (3) and (8),
(xi) (5) and (8),
(xii) (6) and (8),
(xiii) (3), (5), and (6),
(xiv) (5), (7), and (8).

8. Find the following sets: 


WL21.."ID„' 

W) ID,, n t-4|„ n [7J JS , 

(»i) [2j, r, QJ, n [-2J, n 

HJ„- (>» |nJ 30 ^ |20| S1 . 

9. Evaluate (|_5j 50 u Li2Jso) n 

1 7 | 23 . Can you find a pair of simultaneous 

linear congruences for which this is the set of all solutions ? 

10. Viewing residue classes as sets of integers, prove 

<olUii <= L5 Jt 

VO L!Uj4=1U«" 

wo 111 , - liU 1U. • 


« HJ, = HJ„ ^ |]lk ^ lilk “ 22 „• 
Of course, the unions in (iii), (iv), (v) are disjoint.

11. Suppose m and n are integers greater than or equal to 2 and m | n; then for
any a ∈ Z, [a]_n ⊂ [a]_m; in fact, the residue class [a]_m decomposes into
the disjoint union

[a]_m = [a]_n ∪ [a + m]_n ∪ [a + 2m]_n ∪ ··· ∪ [a + (d - 1)m]_n

where n = md.

12. Consider the linear congruence

ax ≡ b (mod m)   (*)

and suppose d = (a, m) divides b. Putting a = da', b = db', m = dm',
consider the linear congruence

a'x ≡ b' (mod m').   (**)

Then (*) and (**) are equivalent in the sense that they have the same set 
of solutions. Prove this. 

Now, (*) has d solutions (mod m), while (**) has a unique solution 
(mod m'). Explain what is going on. 

13. Solve the simultaneous congruences

3x ≡ 5 (mod 22),
11x ≡ 3 (mod 28),
5x ≡ 89 (mod 99).

14. Solve the pair of congruences

9x ≡ 6 (mod 24),
12x ≡ 14 (mod 26).

15. Use the method of 3-1-7 to do the various parts of Problem 7 above. 

16. (Sun-Tse, 1st century A.D.). Find the smallest positive integer which
leaves the remainders 2, 3, 2 upon division by 3, 5, 7, respectively.

17. (Brahmagupta, 7th century A.D.). When the eggs in a basket are removed
2, 3, 4, 5, 6 at a time there are left over 1, 2, 3, 4, 5 eggs, respectively.
When they are removed 7 at a time, there is none left over. How many
eggs are in the basket?

18. If m = m_1 m_2 with (m_1, m_2) = 1, show that the set of solutions of ax ≡
b (mod m) is identical with the set of solutions of the pair

ax ≡ b (mod m_1),
ax ≡ b (mod m_2).
19. Prove that the pair of congruences

x ≡ a (mod m),
x ≡ b (mod n)

has a solution if and only if (m, n) divides (a - b). Moreover, in this
situation, the solution is unique mod [m, n]; in more detail, if x_0 is a
solution, then [x_0]_{[m, n]} is the set of all solutions.

20. Prove the Chinese remainder theorem by using the method illustrated in 
3-1-7. 

21. Suppose p is prime. Then for every a ∈ {1, 2, ..., p - 1} the congruence
ax ≡ 1 (mod p) has a unique solution; that is, there exists a unique
a' ∈ {1, 2, ..., p - 1} for which aa' ≡ 1 (mod p). Use this to prove
that

(p - 1)! ≡ -1 (mod p).

This result is known as Wilson’s theorem. It may also be expressed in the 
form: the product of all nonzero elements of Z p is — 1. 


3-2. Units and Fields 

One of the questions considered in the preceding section was that of
solving the linear equation αx = β in Z_m. The same question may also be
formulated in a more general context; namely, here and throughout this
chapter, we let R = {a, b, c, ..., u, v, ...} denote a commutative ring with
unity e, and make the following definition.


3-2-1. Definition. Given a, b ∈ R, a ≠ 0, we say that a divides b in R (or
a is a factor of b, or b is divisible by a, or b is a multiple of a) and write a | b
when there exists c ∈ R for which ac = b. There is no requirement that c be
unique. If no such element c ∈ R exists, we say that a does not divide b in R,
and write a ∤ b.

Thus, a | b if and only if the linear equation ax = b has a solution in R.


3-2-2. Remark. Of course, the definition of divisibility in R is the same
as the one given long ago for divisibility in Z. More precisely, we have
extended or generalized the notion of divisibility from Z (which is a commutative
ring with unity) to an arbitrary commutative ring with unity, R. Among the
properties of divisibility in Z that are also valid in R, we have:

(i) a | 0 for every nonzero a ∈ R.

(ii) If a | b and b | c, then a | c.

(iii) (±e) | a for all a ∈ R; (±a) | a for all a ≠ 0.

(iv) a | b ⇔ a | (-b) ⇔ (-a) | b ⇔ (-a) | (-b).

(v) If a | b and a | c, then a | (xb + yc) for all x, y ∈ R; in particular
a | (b + c) and a | (b - c).

The proofs of these assertions are immediate; they are identical with the proofs
in Z, as given in 1-1-2.

On the other hand, many properties of divisibility in Z need not hold in
R. For example, from a | e in R we cannot conclude that a = ±e; also, a | b
and b | a together need not imply a = ±b. (Illustrations of these statements
will appear shortly.) Furthermore, there is no reason to expect the division
algorithm, or anything like it, to hold in R.

Of course, the question of whether a divides b depends on the ring R in
which they are viewed. For example, 2 ∤ 3 in Z; but since 2, 3, 3/2 are all
perfectly good rational numbers with (2)(3/2) = 3, it follows that 2 | 3 in Q.

3-2-3. Examples. (i) The case R = Z is familiar. It was treated in detail
in Chapter I. We have nothing to add here.

(ii) Suppose R = Z_7, which is indeed a commutative ring with unity [1]_7.
Because [3]_7 · [2]_7 = [6]_7 and [6]_7 · [4]_7 = [3]_7, the nonzero elements
[3]_7 and [6]_7 divide each other in Z_7 (that is, [3]_7 | [6]_7 and [6]_7 | [3]_7); but
note that [3]_7 ≠ ±[6]_7.

In addition, it is not hard to verify that for any α, β ∈ Z_7 with α ≠ 0, we
have α | β. One way to see this is directly from the multiplication table for
Z_7; the row associated with any α ≠ 0 includes every β ∈ Z_7. Another, more
sophisticated, way to see this is as follows. Write α = [a]_7, β = [b]_7 where
a, b ∈ Z, and 7 ∤ a since α ≠ 0. Then, as seen in Section 3-1,

α | β ⇔ the linear equation αx = β has a solution in Z_7
      ⇔ the linear congruence ax ≡ b (mod 7) has a solution.

But since (a, 7) = 1, this linear congruence does have a solution; hence, α | β.

In particular, in Z_7, [2]_7 | [1]_7, [3]_7 | [1]_7, [4]_7 | [1]_7, [5]_7 | [1]_7,
[6]_7 | [1]_7.

(iii) Consider the commutative ring with unity, Z 12 . All kinds of interest¬ 
ing things can happen here. We list some facts about Z 12 —they can be 
verified easily by trial and error, or somewhat more efficiently by use of the 
multiplication table for Z 12 . 
(1) [5]_12 divides every element of Z_12,

(2) [5]_12 is divisible by [1]_12, [5]_12, [7]_12 = -[5]_12, [11]_12 = -[1]_12;
no other elements of Z_12 divide [5]_12,

(3) [6]_12 divides only itself and [0]_12,

(4) [6]_12 is divisible by all nonzero elements of Z_12 except [4]_12 and [8]_12,

(5) The identity [1]_12 is divisible only by the four elements [1]_12, [5]_12,
[7]_12, [11]_12,

( 6 ) IJJ 12 and |_7 _| 12 divide each other, but [_3j 12 # + [7j 12 , 

(7) [4]_12 and [8]_12 divide each other (which is not surprising because
[8]_12 = -[4]_12),

(8) The result of "dividing" [8]_12 by [4]_12 is not unique, since

[4]_12 · [2]_12 = [4]_12 · [5]_12 = [4]_12 · [8]_12 = [4]_12 · [11]_12 = [8]_12.

It is instructive to prove several of these results in a more theoretical
manner, namely, by the method introduced in (ii) above. We write elements
of Z_12 in the form α = [a]_12, β = [b]_12, it being understood that 0 ≤ a, b ≤ 11.

(1) To decide which elements of Z_12 are divisible by [5]_12, we observe
that α = [5]_12 divides β = [b]_12 ⇔ the linear congruence 5x ≡ b (mod 12) has
a solution. But according to 3-1-5, this congruence has a solution for every
b because (5, 12) = 1 divides b. Thus, α = [5]_12 divides every β = [b]_12 in Z_12.

(2) To decide which elements of Z_12 divide [5]_12, we observe that α =
[a]_12 ≠ 0 divides β = [5]_12 ⇔ ax ≡ 5 (mod 12) has a solution ⇔ (a, 12) | 5
⇔ (a, 12) = 1. Thus, the congruence has a solution only when a is prime to
12; and the divisors of [5]_12 are [1]_12, [5]_12, [7]_12, [11]_12.

(3) α = [6]_12 divides β = [b]_12 ⇔ 6x ≡ b (mod 12) has a solution ⇔
(6, 12) | b ⇔ 6 | b ⇔ b = 0 or 6.

(4) α = [a]_12 ≠ 0 divides β = [6]_12 ⇔ ax ≡ 6 (mod 12) has a solution ⇔
(a, 12) | 6 ⇔ a = 1, 2, 3, 5, 6, 7, 9, 10, 11.

The remaining assertions may now safely be left to the reader. 
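The criterion used in (1) to (4), that [a]_n divides [b]_n exactly when (a, n) divides b, lends itself to a quick mechanical check of the listed facts about Z_12. This is only a sketch; the function name divides is a hypothetical helper.

```python
from math import gcd


def divides(a, b, n):
    """True when [a]_n divides [b]_n in Z_n; [a]_n must be nonzero.

    Uses the solvability criterion for a*x = b (mod n): a solution exists
    exactly when gcd(a, n) divides b.
    """
    a %= n
    if a == 0:
        raise ValueError("the divisor must be a nonzero element of Z_n")
    return b % gcd(a, n) == 0


n = 12
multiples_of_5 = [b for b in range(n) if divides(5, b, n)]
divisors_of_5 = [a for a in range(1, n) if divides(a, 5, n)]
multiples_of_6 = [b for b in range(n) if divides(6, b, n)]
divisors_of_6 = [a for a in range(1, n) if divides(a, 6, n)]

print(multiples_of_5)   # every element of Z_12
print(divisors_of_5)    # the a prime to 12
print(multiples_of_6)
print(divisors_of_6)    # everything nonzero except 4 and 8
```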

(iv) Consider Q, the domain of rational numbers (see 2-2-1). The rational
numbers 37/79 and 47/97 divide each other, since

(37/79) · ((79 · 47)/(37 · 97)) = 47/97   and   (47/97) · ((97 · 37)/(47 · 79)) = 37/79.

More generally, any nonzero rational number α divides any rational number
β. In fact, α is of form m_1/n_1 with m_1, n_1 ∈ Z, n_1 ≠ 0, and β is of form m_2/n_2
with m_2, n_2 ∈ Z, n_2 ≠ 0; furthermore, because α ≠ 0 we have m_1 ≠ 0. Then

(m_1/n_1) · ((n_1 m_2)/(m_1 n_2)) = m_2/n_2

and (n_1 m_2)/(m_1 n_2) ∈ Q because n_1 m_2 ∈ Z, m_1 n_2 ∈ Z, m_1 n_2 ≠ 0, so that
α | β. In particular, any nonzero element α ∈ Q divides the identity element
1 = 1/1 = n/n of Q.

(v) In the domain of real numbers, R (see 2-2-1), any nonzero element
divides any real number. (This is a “familiar fact” from elementary school, 
but for us it is really an “ assumption ” about real numbers. It would require 
a massive amount of work to develop R in a careful, axiomatic way and then 
prove this statement about divisibility.) In particular, any nonzero real 
number divides the identity 1. 

(vi) Consider the complex numbers, C. This is a commutative ring with
unity, 1 + 0i; in fact, as seen in 2-2-1, C is an integral domain. Recall that an
element α ∈ C is of form α = a + bi where a, b ∈ R; and α ≠ 0 when a and b
are not both 0; in other words, α ≠ 0 ⇔ a^2 + b^2 ≠ 0.

We assert that every complex number α ≠ 0 divides 1 = 1 + 0i. To see
this, note first that because α ≠ 0 division by a^2 + b^2, in R, is permissible
(according to part (v) above). Hence a/(a^2 + b^2) ∈ R, b/(a^2 + b^2) ∈ R, and

(a + bi)(a/(a^2 + b^2) - (b/(a^2 + b^2))i) = 1.

More generally, for any complex numbers α = a + bi ≠ 0 and β = c + di
we have α | β. In fact, it is straightforward to verify that

(a + bi)((ac + bd)/(a^2 + b^2) + ((ad - bc)/(a^2 + b^2))i) = c + di.

This may look mysterious, but it is not. All we have done is solved the equation

(a + bi)(x + yi) = c + di

by multiplying both sides by a - bi.

(vii) Consider the integral domain Z[i], which was discussed in 2-2-4. An
element of Z[i] is of form a + bi with a, b ∈ Z. Because (1 + i)(1 - i) = 2, we
see that both 1 + i and 1 - i divide 2. We also have (3 + 2i) | (-2 + 3i) and
(-2 + 3i) | (3 + 2i), since -2 + 3i = (3 + 2i)i and 3 + 2i = (-2 + 3i)(-i);
but -2 + 3i ≠ ±(3 + 2i). On the other hand, (2 + i) ∤ (3 + 2i). To see this,
one tries to solve

(2 + i)(x + yi) = 3 + 2i.

Multiplying both sides by 2 - i gives

(5x) + (5y)i = 8 + i.

This equation has no solutions in integers x and y, so (2 + i) ∤ (3 + 2i) in
Z[i].

Clearly, the four elements ±1, ±i are divisors of 1 in Z[i]; it is not hard
to see that there are no others.
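The test used in (vii), multiplying by the conjugate and asking whether the resulting coordinates are divisible by a^2 + b^2, can be sketched as a short routine. The name gaussian_quotient is hypothetical, and Gaussian integers are represented here simply as pairs of Python integers.

```python
def gaussian_quotient(a, b, c, d):
    """Try to divide c + di by a + bi in Z[i].

    Return (x, y), meaning x + yi, when (a + bi)(x + yi) = c + di has a
    solution in integers, and None otherwise. The divisor must be nonzero.
    """
    norm = a * a + b * b                  # (a + bi)(a - bi)
    if norm == 0:
        raise ValueError("the divisor must be nonzero")
    real = a * c + b * d                  # real part of (c + di)(a - bi)
    imag = a * d - b * c                  # imaginary part of the same product
    if real % norm != 0 or imag % norm != 0:
        return None
    return real // norm, imag // norm


print(gaussian_quotient(1, 1, 2, 0))      # (1 + i) into 2
print(gaussian_quotient(3, 2, -2, 3))     # (3 + 2i) into (-2 + 3i)
print(gaussian_quotient(2, 1, 3, 2))      # (2 + i) into (3 + 2i)
```

Searching a small box of pairs (a, b) for those with gaussian_quotient(a, b, 1, 0) not None recovers the four divisors of 1 within that box.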

(viii) Consider the integral domain (of 2-2-5) 

Zh/2] = {a + bs!l\ a, be Z}. 

To decide, for example, if 2 + 3\/2 divides 3 + 5V2, we seek integers x, y 
such that 


(2 + 3\/ 2)(x + y-\J 2) = 3 + 5 V 2. 

Multiplying both sides by 2 — 3\/2 yields 

(-14)* + (- \Ay)\]l = -24 + V2. 

Since no integers satisfy this equation, it follows that (2 + 3\/2)^ (3 + 5\/2) 
in Z[V2]. 

In similar fashion, we see that $(2 + 3\sqrt{2}) \mid (4 + 13\sqrt{2})$. In fact, from

$$(2 + 3\sqrt{2})(x + y\sqrt{2}) = 4 + 13\sqrt{2}$$

we obtain

$$(-14)x + (-14y)\sqrt{2} = -70 + 14\sqrt{2}.$$

Thus, $x = 5$, $y = -1$ and indeed $(2 + 3\sqrt{2})(5 - \sqrt{2}) = 4 + 13\sqrt{2}$.

Finally, because $(1 + \sqrt{2})(-1 + \sqrt{2}) = 1$ and $(3 + 2\sqrt{2})(3 - 2\sqrt{2}) = 1$,
the elements $1 + \sqrt{2}$, $-1 + \sqrt{2}$, $3 + 2\sqrt{2}$, $3 - 2\sqrt{2}$ are divisors of 1. There
are many other divisors of 1; can you find some of them?
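The test used in (vii) and (viii), namely multiplying by the conjugate and asking whether the resulting equation has integer solutions, is mechanical enough to sketch in code. The following Python fragment is only an illustration: the function name `divide_in_quadratic_ring` and the encoding of $a + b\sqrt{D}$ as the pair `(a, b)` are our own assumptions, not notation from the text, and the sketch assumes $D$ is an integer that is not a perfect square.

```python
def divide_in_quadratic_ring(alpha, beta, D):
    """Try to solve alpha * (x + y*sqrt(D)) = beta with integers x, y.

    alpha = (a, b) stands for a + b*sqrt(D); beta = (c, e) likewise.
    D = -1 gives Z[i]; D = 2 gives Z[sqrt(2)].
    Returns (x, y) if alpha divides beta, otherwise None.
    """
    a, b = alpha
    c, e = beta
    norm = a * a - D * b * b      # alpha times its conjugate a - b*sqrt(D)
    if norm == 0:                 # only alpha = 0, since D is not a square
        return (0, 0) if (c, e) == (0, 0) else None
    # beta * conjugate(alpha) = (a*c - D*b*e) + (a*e - b*c)*sqrt(D)
    rational_part = a * c - D * b * e
    radical_part = a * e - b * c
    if rational_part % norm == 0 and radical_part % norm == 0:
        return (rational_part // norm, radical_part // norm)
    return None


print(divide_in_quadratic_ring((3, 2), (-2, 3), -1))  # (0, 1): quotient is i
print(divide_in_quadratic_ring((2, 1), (3, 2), -1))   # None
print(divide_in_quadratic_ring((2, 3), (3, 5), 2))    # None
print(divide_in_quadratic_ring((2, 3), (4, 13), 2))   # (5, -1)
```

The four calls reproduce the four decisions made by hand above; searching small pairs `(a, b)` for which the call with `beta = (1, 0)` succeeds is one way to hunt for further divisors of 1.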

3-2-4. Definition. Suppose R is a commutative ring with unity e; an
element $a \in R$ is said to be a unit when $a \mid e$.

3-2-5. Remarks. (1) According to the definition, a is a unit of R if and
only if the equation $ax = e$ has a solution in R, or, to put it another way,
if and only if a has an inverse for multiplication.

(2) If a is a unit, then its inverse is unique; we denote it by $a^{-1}$.

Proof: Suppose both b and c are inverses of a, so $ab = e = ba$ and
$ac = e = ca$. Then

$$b = be = b(ac) = (ba)c = ec = c.$$

(3) If a is a unit and $ab = ac$, then $b = c$; in other words, units can be
“canceled.”

Proof: Multiply both sides of $ab = ac$ by $a^{-1}$. (Note that if a is not a unit,
then one is not permitted to cancel it. For example, in $\mathbf{Z}_{12}$ we have
$[4]_{12} \cdot [2]_{12} = [4]_{12} \cdot [5]_{12}$, but $[2]_{12} \neq [5]_{12}$; $[4]_{12}$ may not be canceled
because it is not a unit.)

(4) An element of R cannot be both a unit and a zero-divisor.

Proof: If a is a zero-divisor, there exists $c \neq 0$ in R such that $ac = 0$. If, in
addition, a is a unit, then $a^{-1}$ exists and

$$c = (a^{-1}a)c = a^{-1}(ac) = a^{-1}0 = 0,$$

which contradicts $c \neq 0$.

(5) If a is a unit, then for any $b \in R$ the equation $ax = b$ has a unique
solution; in other words, a unit divides any element of R, and the result of this
“division” is unique.

Proof: Clearly, $x = a^{-1}b$ is a solution, and by cancellation [see (3) above]
the solution is unique.

(6) e is a unit, and so is $-e$.

(7) If a is a unit, so is $a^{-1}$, and $(a^{-1})^{-1} = a$.

Proof: The relation $a^{-1}a = aa^{-1} = e$ says that $a^{-1}$ is a unit whose
unique inverse is a; hence, $(a^{-1})^{-1} = a$.

(8) If both a and b are units, then so is ab; that is, the product of units is a
unit.

Proof: Clearly, $b^{-1}a^{-1}$ is the inverse of ab.

(9) For any $a \neq 0$ in R, we define

$$a^0 = e, \quad a^1 = a, \quad a^2 = a \cdot a, \quad \ldots, \quad a^m = a^{m-1} \cdot a \quad (m \geq 1).$$

If a is a unit, we have $aa^{-1} = e$. From (8), we see that $a^2 = a \cdot a$ is a unit
whose inverse is $(a^{-1})(a^{-1}) = (a^{-1})^2$. Therefore, a simple induction guarantees
that for any positive integer m, $a^m$ is a unit whose inverse is $(a^{-1})^m$; in other
words,

$$(a^m)^{-1} = (a^{-1})^m, \quad m > 0.$$

Now, if n is negative, then $-n > 0$, so replacing m by $-n$ in the above, we
have $(a^{-n})^{-1} = (a^{-1})^{-n}$. Then, because we want exponents to behave properly,
let us define $a^n$ (that is, a to some negative exponent) by

$$a^n = (a^{-n})^{-1} = (a^{-1})^{-n}, \quad n < 0.$$

It is now easy to verify the standard rules for exponents; namely, if a and b
are units of R and m and n are any integers, then

(i) $a^m a^n = a^{m+n}$,

(ii) $(a^m)^n = a^{mn}$, $\quad m, n \in \mathbf{Z}$,

(iii) $(ab)^m = a^m b^m$.

The details are left to the reader. Of course, the multiplicative situation here
is entirely analogous to the additive situation discussed in 2-4-6 and 2-4-7.

3-2-6. Examples. Pursuing the examples of 3-2-3 a bit further, we observe:

(i) The only units of $\mathbf{Z}$ are $+1$ and $-1$.

(ii) In $\mathbf{Z}_7$, the units are all six nonzero elements.

(iii) The units of $\mathbf{Z}_{12}$ are the four elements $[1]_{12}$, $[5]_{12}$, $[7]_{12}$, $[11]_{12}$;
there are no others.

(iv) In $\mathbf{Q}$, in $\mathbf{R}$, and in $\mathbf{C}$, every nonzero element is a unit.

(v) We already know that $\pm 1$, $\pm i$ are units of $\mathbf{Z}[i]$. Are there any others?
Suppose $a + bi$ is another unit in $\mathbf{Z}[i]$. Then there is an element $c + di \in \mathbf{Z}[i]$
such that

$$(a + bi)(c + di) = 1.$$

By performing the multiplication we see that $ac - bd = 1$, $bc + ad = 0$. On
the other hand, by multiplying both sides of the equation by $a - bi$, we obtain

$$(a^2 + b^2)c + (a^2 + b^2)di = a - bi.$$

Thus,

$$(a^2 + b^2)c = a,$$

$$(a^2 + b^2)d = -b.$$

Upon multiplying the first of these by c, the second by d, and adding we have

$$(a^2 + b^2)(c^2 + d^2) = ac - bd = 1.$$

This is a contradiction because c and d are integers while $a^2 + b^2 \geq 2$
(because, by hypothesis, $a + bi \neq 0, \pm 1, \pm i$). Consequently, $\pm 1$, $\pm i$ are the
only units in $\mathbf{Z}[i]$.

(vi) We have already observed [in 3-2-3, part (viii)] that

$$(1 + \sqrt{2})(-1 + \sqrt{2}) = 1.$$

Therefore, $\alpha = 1 + \sqrt{2}$ is a unit of $\mathbf{Z}[\sqrt{2}]$, and so is its inverse $\alpha^{-1} = -1 + \sqrt{2}$
[which is often written as $1/(1 + \sqrt{2})$]. Then according to 3-2-5, part (8),
$\alpha^2 = 3 + 2\sqrt{2}$ is a unit, and so is $(\alpha^{-1})^2 = 3 - 2\sqrt{2}$. [It may be noted in
passing that we have already observed, in 3-2-3, part (viii), that $3 + 2\sqrt{2}$ and
$3 - 2\sqrt{2}$ are units with $(3 + 2\sqrt{2})(3 - 2\sqrt{2}) = 1$.] By induction, $\alpha^n$ and
$(\alpha^{-1})^n = \alpha^{-n}$ are units for every positive integer n. We may rephrase this as:
$\alpha^n$ is a unit for every $n \in \mathbf{Z}$. Of course, $-\alpha^n$ is also a unit, so every element of the
set

$$\{\pm\alpha^n = \pm(1 + \sqrt{2})^n \mid n \in \mathbf{Z}\} \qquad (*)$$

is a unit.

Note that $\alpha = 1 + \sqrt{2}$ is a real number greater than 2, and $\alpha^{-1}$ then lies
between 0 and 1. Hence, the elements of the set (*) are distributed along the
real number line as follows:

$$\cdots < -\alpha^2 < -\alpha < -2 < -1 < -\alpha^{-1} < -\alpha^{-2} < \cdots < 0 < \cdots < \alpha^{-3} < \alpha^{-2} < \alpha^{-1} < 1 < 2 < \alpha < \alpha^2 < \alpha^3 < \cdots.$$

The set (*) contains an infinite number of distinct elements, so the number of
units of $\mathbf{Z}[\sqrt{2}]$ is surely infinite. However, it may be that there exist units of
$\mathbf{Z}[\sqrt{2}]$ which are not members of the set (*). This is an interesting question
(and the answer is known), but it is probably inappropriate to try to settle it
here.
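The claim that every power of $1 + \sqrt{2}$ is a unit can be checked numerically for small exponents. The sketch below is illustrative only; the helper names `mul` and `norm` and the pair encoding `(a, b)` for $a + b\sqrt{2}$ are assumptions made for this example. It computes the first few powers and exhibits an explicit inverse for each; it does not address whether there are units outside the set (*).

```python
def mul(u, v):
    """Multiply a + b*sqrt(2) by c + d*sqrt(2); elements are integer pairs."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)


def norm(u):
    """(a + b*sqrt(2)) * (a - b*sqrt(2)) = a^2 - 2*b^2."""
    a, b = u
    return a * a - 2 * b * b


alpha = (1, 1)   # 1 + sqrt(2)
power = (1, 0)   # alpha^0 = 1
for n in range(1, 6):
    power = mul(power, alpha)
    s = norm(power)                          # +1 or -1 for these powers
    inverse = (s * power[0], -s * power[1])  # s * (conjugate of power)
    assert mul(power, inverse) == (1, 0)
    print(n, power, inverse)
```

For $n = 1$ and $n = 2$ the printed inverses are $-1 + \sqrt{2}$ and $3 - 2\sqrt{2}$, in agreement with the products displayed above.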

3-2-7. Definition. A commutative ring with unity in which every nonzero 
element has an inverse for multiplication is said to be a field (it will usually be 
denoted by F). 

In other words, a commutative ring with unity is a field if and only if 
every nonzero element is a unit. 

3-2-8. Examples. (i) In virtue of 3-2-6, $\mathbf{Z}_7$ is a field, and so are $\mathbf{Q}$, $\mathbf{R}$,
and $\mathbf{C}$.

(ii) On the other hand, $\mathbf{Z}_{12}$ and $\mathbf{Z}[i]$ are not fields; each has nonzero
elements which are not units.

(iii) What about the commutative ring with unity $\mathbf{Z}[\sqrt{2}]$? Is it a field?
We have already seen that there are an infinite number of units; in fact,
$\pm(1 + \sqrt{2})^n$ is a unit for every $n \in \mathbf{Z}$. However, we do not know if there are
any nonzero elements which are not units. For this, let us consider a typical
element, say $2 + 3\sqrt{2}$, and decide if it is a unit. If

$$(2 + 3\sqrt{2})(x + y\sqrt{2}) = 1,$$

then by the method of 3-2-3, part (viii),

$$(-14)x + (-14y)\sqrt{2} = 2 - 3\sqrt{2}.$$

Since this has no integer solutions x and y, $2 + 3\sqrt{2}$ is not a unit, so $\mathbf{Z}[\sqrt{2}]$ is
not a field.

(iv) Consider

$$\mathbf{Q}[i] = \mathbf{Q}[\sqrt{-1}] = \{a + bi \mid a, b \in \mathbf{Q}\}.$$

The verification that this is a commutative ring with unity is straightforward;
it goes just like the verification that $\mathbf{Z}[i]$ is a commutative ring with unity. We
assert that $\mathbf{Q}[i]$ is a field. For this, we take an arbitrary nonzero element
$a + bi$, and show it is a unit. We must solve

$$(a + bi)(x + yi) = 1$$

where x and y are to come from $\mathbf{Q}$. Multiplying by the “conjugate” $a - bi$,
we have

$$(a^2 + b^2)x + (a^2 + b^2)yi = a - bi.$$

Of course, $a^2 + b^2 \neq 0$, because not both a and b are 0. Hence, we can
“divide” by $a^2 + b^2$ in the field $\mathbf{Q}$, so

$$x = \frac{a}{a^2 + b^2} \in \mathbf{Q}, \qquad y = \frac{-b}{a^2 + b^2} \in \mathbf{Q}$$

provide a solution; that is,

$$(a + bi)\left(\frac{a}{a^2 + b^2} + \frac{-b}{a^2 + b^2}\,i\right) = 1,$$

and $\mathbf{Q}[i]$ is a field.

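(v) It is easy to check that

$$\mathbf{Q}[\sqrt{-2}] = \{a + b\sqrt{-2} \mid a, b \in \mathbf{Q}\}$$

is a commutative ring with unity. Moreover, it is a field because the arbitrary
nonzero element $a + b\sqrt{-2}$ has the inverse

$$\left(\frac{a}{a^2 + 2b^2}\right) - \left(\frac{b}{a^2 + 2b^2}\right)\sqrt{-2}.$$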


3-2-9. Remarks. (1) There is a parallelism between addition and multiplication
in the axioms for a field. More precisely, both addition and multiplication
satisfy (i) closure for the operation, (ii) the associative law, (iii) the
commutative law, (iv) existence of an identity for the operation, and (v) existence
of an inverse (for addition, this applies to every element; while for
multiplication, the zero element is excluded).

(2) In a field F, any equation of form $ax = b$, $a \neq 0$, $a, b \in F$ has a solution;
in fact, $a^{-1}b$ is the unique solution. Thus, any nonzero element divides every
element of the field. Of course, as is customary in $\mathbf{Q}$, $\mathbf{R}$, and $\mathbf{C}$, one often
denotes the inverse $a^{-1}$ by $1/a$ (where 1 is the unity of F) and then $a^{-1}b$ is also
written in the form $b/a$.

(3) If F is a field, then it is an integral domain. To see this, note that if
$a \neq 0$ is any element of F, then a is a unit; so according to 3-2-5, part (4), a
is not a zero-divisor. Thus, F has no zero-divisors, and is therefore an integral
domain.

The converse is false; for example, Z is an integral domain, but is not a field. 

Now, let us exhibit some additional fields. 


3-2-10. Theorem. For every prime p, $\mathbf{Z}_p$ is a field.

Proof: We denote the identity $[1]_p$ of $\mathbf{Z}_p$ by 1 (this should not cause
serious confusion). Consider any nonzero element $\alpha \in \mathbf{Z}_p$, and write it in the
form $\alpha = [a]_p$ for some choice of $a \in \mathbf{Z}$. Then

$$\begin{aligned}
\alpha \text{ has an inverse} &\iff \alpha x = 1 \text{ has a solution in } \mathbf{Z}_p \\
&\iff ax \equiv 1 \pmod{p} \text{ has a solution in } \mathbf{Z} \\
&\iff (a, p) \text{ divides } 1 \\
&\iff (a, p) = 1.
\end{aligned}$$

Since $\alpha \neq 0$ we know (from the meaning of residue class) that $p \nmid a$, so indeed
$(a, p) = 1$. It follows that $\alpha$ has an inverse, and consequently, $\mathbf{Z}_p$ is a field. ∎
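The chain of equivalences in this proof reduces inversion in $\mathbf{Z}_p$ to solving $ax \equiv 1 \pmod{p}$, that is, to finding integers x, y with $ax + py = 1$. The proof does not say how to find them; one standard route, assumed here purely for illustration, is the extended Euclidean algorithm. The function names below are our own.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def inverse_mod(a, m):
    """Inverse of the residue class of a modulo m, or None if (a, m) != 1."""
    g, x, _ = extended_gcd(a % m, m)
    if g != 1:
        return None
    return x % m


p = 7
for a in range(1, p):
    print(a, inverse_mod(a, p))
```

With `p = 7` every nonzero residue receives an inverse, as the theorem asserts; calling `inverse_mod(4, 12)` returns `None`, reflecting the fact that $(4, 12) \neq 1$.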

It is instructive to give another proof of this result. According to 2-1-19,
$\mathbf{Z}_m$ is an integral domain $\iff$ m is prime. Hence, $\mathbf{Z}_p$ is an integral domain with
a finite number of elements, and to complete the proof it suffices to show:

3-2-11. Theorem. A finite integral domain is a field.

Proof: Suppose D is an integral domain with n elements; denote it by

$$D = \{a_1, a_2, \ldots, a_n\}.$$

Of course, the elements $a_1, a_2, \ldots, a_n$ are distinct, and the zero element 0 and
the unity element e appear in this set.

For any $a \neq 0$ in D, let us look at the elements $aa_1, aa_2, \ldots, aa_n$. If
$aa_i = aa_j$, then, by cancellation (which is permissible in an integral domain),
$a_i = a_j$ and $i = j$. Consequently, the n elements $aa_1, aa_2, \ldots, aa_n$ are distinct,
so they fill up all of D; in other words,

$$D = \{aa_1, aa_2, \ldots, aa_n\}.$$

Now, e appears in this set, which implies the existence of an $a_i$ for which
$aa_i = e$. Thus, a has an inverse, and D is a field. ∎
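The heart of this proof is that multiplication by a fixed $a \neq 0$ merely rearranges the elements of a finite integral domain. A small sketch (the function name `multiples` is ours, and $\mathbf{Z}_7$ and $\mathbf{Z}_{12}$ are chosen only as illustrations) shows the rearrangement in $\mathbf{Z}_7$ and its failure in $\mathbf{Z}_{12}$, which is not an integral domain.

```python
def multiples(a, m):
    """The list a*x mod m as x runs over the residues 0, 1, ..., m - 1."""
    return [a * x % m for x in range(m)]


print(multiples(3, 7))    # [0, 3, 6, 2, 5, 1, 4]
print(multiples(4, 12))   # [0, 4, 8, 0, 4, 8, 0, 4, 8, 0, 4, 8]
```

In the first list every residue mod 7 appears exactly once; in particular 1 appears (at $x = 5$), so $[3]_7$ has an inverse. In the second list 1 never appears, so $[4]_{12}$ has no inverse.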

In virtue of 3-2-10, for any prime p, every nonzero element of $\mathbf{Z}_p$ is a unit.
What about the units of $\mathbf{Z}_m$ for arbitrary $m > 1$? If m is composite, we know
that $\mathbf{Z}_m$ is not an integral domain; hence, it is not a field, and not every
nonzero element is a unit.

3-2-12. Definition. For $m > 1$, we let $\phi(m)$ denote the number of units of
$\mathbf{Z}_m$; $\phi$ is known as the Euler $\phi$-function or as the totient function. If we use the
notations $\mathbf{Z}_m^*$ for the set of all units of $\mathbf{Z}_m$ and # for the phrase “the number
of elements in the set,” then

$$\phi(m) = \#(\mathbf{Z}_m^*).$$

3-2-13. Remark. We reinterpret the Euler $\phi$-function. Consider any
$\alpha = [a]_m \in \mathbf{Z}_m$. By exactly the same argument used in the proof of 3-2-10,
we see that

$$\alpha \text{ is a unit in } \mathbf{Z}_m \iff (a, m) = 1. \qquad (*)$$

In other words, $\alpha$ has an inverse in $\mathbf{Z}_m$ if and only if a and m are relatively
prime.

What happens if we use a different representative $a'$ for $\alpha$? Clearly, for
$\alpha = [a']_m$ we have, as above,

$$\alpha \text{ is a unit in } \mathbf{Z}_m \iff (a', m) = 1.$$

One immediate conclusion to be drawn is: If $[a]_m = [a']_m$, then $(a, m) = 1 \iff
(a', m) = 1$. This is a weak form of the following general fact about
congruences:

If $[a]_m = [a']_m$ [that is, if $a \equiv a' \pmod{m}$] then $(a, m) = (a', m)$. (**)

The proof is easy. Let $d = (a, m)$ and $d' = (a', m)$. We may write $a' = a + tm$.
It follows that $d \mid a'$, and then $d \mid d'$. Similarly, $d' \mid d$, so $d = d'$.

This fact permits us to speak of the greatest common divisor $(\alpha, m)$ when
$\alpha$ is any element of $\mathbf{Z}_m$; namely, we choose any representative a of $\alpha$ and
define

$$(\alpha, m) = (a, m).$$

According to (**), this definition is independent of the choice of representative
for $\alpha$. Naturally, when $(\alpha, m) = 1$, we say that the residue class $\alpha$ is relatively
prime to m.

Thus, $\alpha$ is a unit in $\mathbf{Z}_m \iff (\alpha, m) = 1$, and consequently $\phi(m)$ is the
number of residue classes (mod m) which are relatively prime to m. The m
residue classes (mod m) may be listed as

$$[1]_m, [2]_m, [3]_m, \ldots, [m]_m.$$

To decide how many of these are relatively prime to m, we need only look at
their respective representatives $1, 2, \ldots, m$. The conclusion is:

$\phi(m)$ is the number of positive integers
less than or equal to m that are relatively prime to m.

Incidentally, this formulation of the definition of $\phi(m)$ is the one that is
found in all number theory books.
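The two descriptions of the units of $\mathbf{Z}_m$ (existence of a multiplicative inverse, and relative primality to m) can be compared directly for small m. The following sketch is only a brute-force check under our own naming (`units_by_inverse`, `units_by_gcd`); it is not an efficient way to compute $\phi$.

```python
from math import gcd


def units_by_inverse(m):
    """Representatives a in 0..m-1 for which a*x = 1 (mod m) has a solution."""
    return [a for a in range(m) if any(a * x % m == 1 for x in range(m))]


def units_by_gcd(m):
    """Representatives a in 1..m that are relatively prime to m."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


print(units_by_inverse(12))   # [1, 5, 7, 11]
print(units_by_gcd(12))       # [1, 5, 7, 11]
print(len(units_by_gcd(6)), len(units_by_gcd(7)))   # 2 6
```

For every $m > 1$ the two lists coincide, and their common length is $\phi(m)$.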

As illustrations, we observe that $\phi(6) = 2$ (of the integers 1, 2, 3, 4, 5, 6
only 1 and 5 are relatively prime to 6, or, to put it another way, $[1]_6$ and
$[5]_6$ are the only units of $\mathbf{Z}_6$), $\phi(12) = 4$ (only 1, 5, 7, 11 are relatively prime
to 12), $\phi(5) = 4$, $\phi(7) = 6$, $\phi(11) = 10$, $\phi(13) = 12$. In fact, for any prime p,

$$\phi(p) = p - 1$$

because $1, 2, 3, \ldots, p - 1$ are relatively prime to p, and p is not. More
generally, we have:


3-2-14. Proposition. For any prime p and any $r \geq 1$,

$$\phi(p^r) = p^r - p^{r-1} = p^{r-1}(p - 1).$$

Proof: Consider the set of $p^r$ integers

$$P = \{1, 2, 3, \ldots, p^r\}.$$

Because p is prime, an integer is relatively prime to $p^r$ if and only if it is
relatively prime to p, so $\phi(p^r)$ equals the number of elements of P which are relatively
prime to p. Now, let us determine how many elements of P are not relatively
prime to p. Because p is prime, an integer is not relatively prime to p if and
only if it is divisible by p. Therefore, the elements of P which are not relatively
prime to p are

$$p, 2p, 3p, \ldots, p^{r-1} \cdot p$$

and there are $p^{r-1}$ such elements. Hence, the number of elements of P which
are relatively prime to p is $p^r - p^{r-1}$, which proves the desired formula for
$\phi(p^r)$. ∎

According to this result, we have, for example,

$$\phi(7^3) = 7^2(7 - 1) = 294 \quad \text{and} \quad \phi(5^5) = 5^4(5 - 1) = 2500.$$
We know how to compute the value of the Euler $\phi$-function for prime
powers $p^r$, and our objective is to find $\phi(m)$ for any $m > 1$. To accomplish this,
we make use of the fact that $\phi$ is “multiplicative,” by which is meant that
the following property holds:

if $(m, n) = 1$, then $\phi(mn) = \phi(m)\phi(n)$.

The proof of this property will be given shortly; but for the moment, let us
suppose it is true. Then, given any $m > 1$, we consider its unique factorization
into prime powers

$$m = p_1^{r_1} p_2^{r_2} \cdots p_s^{r_s}$$

where $p_1, p_2, \ldots, p_s$ are distinct primes and the exponents $r_1, r_2, \ldots, r_s$ are
all greater than or equal to 1. Clearly, $p_1^{r_1}$ and $p_2^{r_2} \cdots p_s^{r_s}$ are relatively prime,
so according to the multiplicative property

$$\phi(m) = \phi(p_1^{r_1})\,\phi(p_2^{r_2} \cdots p_s^{r_s}).$$

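This procedure may be repeated as many times as necessary; in the end (that
is, inductively) we arrive at

$$\begin{aligned}
\phi(m) &= \phi(p_1^{r_1})\,\phi(p_2^{r_2}) \cdots \phi(p_s^{r_s}) \\
&= (p_1^{r_1} - p_1^{r_1 - 1})(p_2^{r_2} - p_2^{r_2 - 1}) \cdots (p_s^{r_s} - p_s^{r_s - 1}) \\
&= p_1^{r_1}\left(1 - \frac{1}{p_1}\right) p_2^{r_2}\left(1 - \frac{1}{p_2}\right) \cdots p_s^{r_s}\left(1 - \frac{1}{p_s}\right) \\
&= m\left(1 - \frac{1}{p_1}\right)\left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_s}\right).
\end{aligned}$$

Furthermore, because the $p_i$ are precisely those primes which divide m, this
may be written as

$$\phi(m) = m \prod_{p \mid m}\left(1 - \frac{1}{p}\right).$$

We have proved:

3-2-15. Theorem. If m is any integer greater than 1, then

$$\phi(m) = m \prod_{p \mid m}\left(1 - \frac{1}{p}\right).$$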
As an illustration of this result, suppose $m = 36{,}000 = 2^5 \cdot 3^2 \cdot 5^3$. Then

$$\phi(36{,}000) = 36{,}000\left(1 - \tfrac{1}{2}\right)\left(1 - \tfrac{1}{3}\right)\left(1 - \tfrac{1}{5}\right),$$

which is somewhat awkward as we must clear the fractions. Instead, it is more
efficient to make use of the general expression

$$\begin{aligned}
\phi(m) &= (p_1^{r_1} - p_1^{r_1 - 1})(p_2^{r_2} - p_2^{r_2 - 1}) \cdots (p_s^{r_s} - p_s^{r_s - 1}) \\
&= p_1^{r_1 - 1} p_2^{r_2 - 1} \cdots p_s^{r_s - 1}(p_1 - 1)(p_2 - 1) \cdots (p_s - 1).
\end{aligned}$$

Thus, we have

$$\phi(36{,}000) = 2^4 \cdot 3^1 \cdot 5^2 (2 - 1)(3 - 1)(5 - 1) = 9600.$$
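The second expression for $\phi(m)$ translates directly into a short program. The sketch below is an illustration under our own choices: the names `prime_power_factors`, `phi`, and `phi_by_counting` are hypothetical, and the factorization uses plain trial division, which is adequate only for small m.

```python
from math import gcd


def prime_power_factors(m):
    """Return {p: r} with m equal to the product of p**r, by trial division."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors


def phi(m):
    """Product of p**(r-1) * (p - 1) over the prime powers p**r in m."""
    result = 1
    for p, r in prime_power_factors(m).items():
        result *= p ** (r - 1) * (p - 1)
    return result


def phi_by_counting(m):
    """Count the integers in 1..m relatively prime to m (the definition)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


print(prime_power_factors(36000))   # {2: 5, 3: 2, 5: 3}
print(phi(36000))                   # 9600
print(phi(7**3), phi(5**5))         # 294 2500
```

Comparing `phi` with `phi_by_counting` for small m gives an independent check of Theorem 3-2-15, although of course not a proof.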

Now, let us turn to the missing part of the proof of 3-2-15. We must prove

3-2-16. Lemma. If m and n are relatively prime positive integers, then

$$\phi(mn) = \phi(m)\phi(n).$$

According to our definition of the Euler $\phi$-function, it must be shown that

$$\#(\mathbf{Z}_{mn}^*) = \#(\mathbf{Z}_m^*)\,\#(\mathbf{Z}_n^*), \qquad (m, n) = 1.$$

We shall give a fairly sophisticated algebraic proof of this formula; it will be
accomplished in a sequence of steps.


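3-2-17. Theorem. If $m > 1$, $n > 1$, $(m, n) = 1$, then the ring $\mathbf{Z}_{mn}$ and the
direct sum ring $\mathbf{Z}_m \oplus \mathbf{Z}_n$ are isomorphic; that is,

$$\mathbf{Z}_{mn} \cong \mathbf{Z}_m \oplus \mathbf{Z}_n.$$

Proof: A typical element of $\mathbf{Z}_{mn}$ is of form $[a]_{mn}$ where $a \in \mathbf{Z}$. We also
recall (see 2-2-11) that the elements of $\mathbf{Z}_m \oplus \mathbf{Z}_n$ are the ordered pairs $(\alpha, \beta)$
where $\alpha \in \mathbf{Z}_m$, $\beta \in \mathbf{Z}_n$, and that the operations of addition and multiplication
in $\mathbf{Z}_m \oplus \mathbf{Z}_n$ are componentwise.

Let us define a mapping

$$f : \mathbf{Z}_{mn} \to \mathbf{Z}_m \oplus \mathbf{Z}_n$$

by putting

$$f([a]_{mn}) = ([a]_m, [a]_n).$$

This definition seems to depend on the choice of the representative a of the
residue class $[a]_{mn}$, so we must verify that f is well defined. In other words, we
must show that if $[a']_{mn} = [a]_{mn}$, then $f([a']_{mn}) = f([a]_{mn})$; but this is immediate
from

$$\begin{aligned}
{[a']_{mn}} = [a]_{mn} &\implies a' \equiv a \pmod{mn} \\
&\implies a' \equiv a \pmod{m} \text{ and } a' \equiv a \pmod{n} \\
&\implies [a']_m = [a]_m \text{ and } [a']_n = [a]_n \\
&\implies ([a']_m, [a']_n) = ([a]_m, [a]_n) \\
&\implies f([a']_{mn}) = f([a]_{mn}).
\end{aligned}$$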

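The verification that f is a homomorphism is straightforward. For any
$[a]_{mn}, [b]_{mn} \in \mathbf{Z}_{mn}$, we have

$$f([a]_{mn} + [b]_{mn}) = f([a + b]_{mn}) = ([a + b]_m, [a + b]_n)$$

and

$$f([a]_{mn}) + f([b]_{mn}) = ([a]_m, [a]_n) + ([b]_m, [b]_n) = ([a + b]_m, [a + b]_n).$$

Thus, $f([a]_{mn} + [b]_{mn}) = f([a]_{mn}) + f([b]_{mn})$ and f preserves addition. In
similar fashion,

$$f([a]_{mn} \cdot [b]_{mn}) = f([ab]_{mn}) = ([ab]_m, [ab]_n)$$

and

$$f([a]_{mn}) \cdot f([b]_{mn}) = ([a]_m, [a]_n) \cdot ([b]_m, [b]_n) = ([ab]_m, [ab]_n).$$

Thus, $f([a]_{mn} \cdot [b]_{mn}) = f([a]_{mn}) \cdot f([b]_{mn})$ and f preserves multiplication.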

The next step is to show that f is surjective. Consider an arbitrary element
of $\mathbf{Z}_m \oplus \mathbf{Z}_n$; it is an ordered pair $(\alpha, \beta)$ where $\alpha \in \mathbf{Z}_m$, $\beta \in \mathbf{Z}_n$, and, of course,
we may write $\alpha = [a]_m$, $\beta = [b]_n$ where $a, b \in \mathbf{Z}$. The problem facing us is to
exhibit the element $([a]_m, [b]_n)$ of $\mathbf{Z}_m \oplus \mathbf{Z}_n$ as the image under f of some
element of $\mathbf{Z}_{mn}$; that is, we seek $[c]_{mn}$ such that

$$f([c]_{mn}) = ([a]_m, [b]_n).$$

According to the definition of f this means we are searching for $c \in \mathbf{Z}$ such
that

$$([c]_m, [c]_n) = ([a]_m, [b]_n) \in \mathbf{Z}_m \oplus \mathbf{Z}_n,$$

which translates to the requirement that both

$$[c]_m = [a]_m \quad \text{and} \quad [c]_n = [b]_n.$$

Putting this in the language of congruences, we want $c \in \mathbf{Z}$ for which

$$c \equiv a \pmod{m},$$
$$c \equiv b \pmod{n}.$$

But now, because m and n are relatively prime (this is the first time we have
used the hypothesis $(m, n) = 1$), the Chinese remainder theorem, 3-1-8,
guarantees the existence of $c \in \mathbf{Z}$ which satisfies these simultaneous
congruences. This proves that f is surjective. (More precisely, the statement that f is
surjective may be viewed as a restatement of the Chinese remainder theorem in
the current circumstances.)

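It remains only to show that f is injective (that is, one-to-one). One proof
consists of the simple observation that f is a mapping of $\mathbf{Z}_{mn}$, which has mn
elements, onto $\mathbf{Z}_m \oplus \mathbf{Z}_n$, which (because it consists of all ordered pairs $(\alpha, \beta)$
where $\alpha \in \mathbf{Z}_m$, $\beta \in \mathbf{Z}_n$) also has mn elements, so f must be one-to-one.
Another, more algebraic proof, consists of showing that $\ker f = (0)$; namely,

$$\begin{aligned}
{[c]_{mn}} \in \ker f &\implies f([c]_{mn}) = 0 \in \mathbf{Z}_m \oplus \mathbf{Z}_n \\
&\implies ([c]_m, [c]_n) = ([0]_m, [0]_n) \\
&\implies [c]_m = [0]_m \text{ and } [c]_n = [0]_n \\
&\implies c \equiv 0 \pmod{m} \text{ and } c \equiv 0 \pmod{n} \\
&\implies m \mid c \text{ and } n \mid c \\
&\implies mn \mid c \quad [\text{since } (m, n) = 1] \\
&\implies c \equiv 0 \pmod{mn} \\
&\implies [c]_{mn} = [0]_{mn}.
\end{aligned}$$

This completes the proof. ∎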
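For particular small m and n, everything asserted in Theorem 3-2-17 can be confirmed by exhaustive checking. The sketch below is only that, a finite check and not a proof; the names `f` and `check_isomorphism` are ours, and residue classes are represented by their least nonnegative representatives.

```python
def f(a, m, n):
    """The map [a]_mn -> ([a]_m, [a]_n), applied to a representative a."""
    return (a % m, a % n)


def check_isomorphism(m, n):
    """Return (bijective, preserves_operations) for f on Z_mn, by brute force."""
    mn = m * n
    images = {f(a, m, n) for a in range(mn)}
    bijective = len(images) == mn      # mn distinct images among mn pairs
    preserves_operations = True
    for a in range(mn):
        for b in range(mn):
            fa, fb = f(a, m, n), f(b, m, n)
            total = ((fa[0] + fb[0]) % m, (fa[1] + fb[1]) % n)
            product = ((fa[0] * fb[0]) % m, (fa[1] * fb[1]) % n)
            if f((a + b) % mn, m, n) != total:
                preserves_operations = False
            if f((a * b) % mn, m, n) != product:
                preserves_operations = False
    return bijective, preserves_operations


print(check_isomorphism(4, 9))   # (True, True)
print(check_isomorphism(4, 6))   # (False, True)
```

The second call illustrates the role of the hypothesis $(m, n) = 1$: for $m = 4$, $n = 6$ the map is still a homomorphism, but it is no longer one-to-one (for example, 0 and 12 have the same image).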

Now, $\mathbf{Z}_{mn}$ is a commutative ring with unity $[1]_{mn}$ and $\mathbf{Z}_m \oplus \mathbf{Z}_n$ is a
commutative ring with unity $([1]_m, [1]_n)$. Because these rings are isomorphic, it is
only natural to expect that they have the “same” units; in particular, these
rings should have the same number of units. In other words, if the set of
units of $\mathbf{Z}_m \oplus \mathbf{Z}_n$ is denoted by $(\mathbf{Z}_m \oplus \mathbf{Z}_n)^*$, we expect to have

$$\#(\mathbf{Z}_{mn}^*) = \#(\mathbf{Z}_m \oplus \mathbf{Z}_n)^*.$$

This relation is indeed true. It is an immediate consequence of the following
general result.


3-2-18. Proposition. Suppose R is a commutative ring with unity e which
is isomorphic to the commutative ring $R'$ with unity $e'$. Let $f : R \to R'$ be
an isomorphism, and for $a \in R$ let us put $a' = f(a)$; then

(i) $f(e) = e'$,

(ii) there is a one-to-one correspondence between the set of units of R
and the set of units of $R'$; namely, u is a unit of R $\iff$ $u' = f(u)$ is a
unit of $R'$.


Proof: Part (i) is immediate from 2-5-8. As for (ii): u is a unit of R $\implies$
there exists $v \in R$ such that $uv = e \implies f(uv) = f(e) \implies f(u)f(v) = e' \implies u'v' = e'
\implies u'$ is a unit of $R'$. Conversely, if $u'$ is a unit of $R'$, then there exists $v' \in R'$
such that $u'v' = e'$. Now, there exist unique elements $u, v \in R$ for which
$u' = f(u)$, $v' = f(v)$. Hence, $f(uv) = f(u)f(v) = u'v' = e' = f(e)$, so $uv = e$
(because f is one-to-one) and u is indeed a unit of R. ∎

Our next task is to find the number of units of $\mathbf{Z}_m \oplus \mathbf{Z}_n$, that is, to
compute $\#(\mathbf{Z}_m \oplus \mathbf{Z}_n)^*$, in terms of the units of $\mathbf{Z}_m$ and $\mathbf{Z}_n$. This is
accomplished via our next result.


3-2-19. Proposition. Suppose $R_1$ is a commutative ring with unity $e_1$ and
$R_2$ is a commutative ring with unity $e_2$. Then the direct sum $R_1 \oplus R_2$ is a
commutative ring with unity $(e_1, e_2)$. Moreover, for $u_1 \in R_1$, $u_2 \in R_2$,
$(u_1, u_2)$ is a unit of $R_1 \oplus R_2$ if and only if $u_1$ is a unit of $R_1$ and $u_2$ is a unit of
$R_2$.

Proof: The first assertion is trivial. Because both $R_1$ and $R_2$ are
commutative, $R_1 \oplus R_2$ is clearly a commutative ring; and for any $(a_1, a_2) \in R_1 \oplus R_2$,

$$(e_1, e_2)(a_1, a_2) = (e_1 a_1, e_2 a_2) = (a_1, a_2),$$

so $(e_1, e_2)$ is a unity element.

If $(u_1, u_2)$ is a unit of $R_1 \oplus R_2$, then there exists $(v_1, v_2) \in R_1 \oplus R_2$ such
that $(u_1, u_2)(v_1, v_2) = (e_1, e_2)$; therefore, $(u_1 v_1, u_2 v_2) = (e_1, e_2)$ and $u_1 v_1 = e_1$,
$u_2 v_2 = e_2$, so $u_1$ is a unit of $R_1$ and $u_2$ is a unit of $R_2$.

Conversely, if $u_1$ is a unit of $R_1$ and $u_2$ is a unit of $R_2$, then there exist
$v_1 \in R_1$, $v_2 \in R_2$ such that $u_1 v_1 = e_1$, $u_2 v_2 = e_2$. Consequently, $(u_1, u_2)(v_1, v_2) =
(u_1 v_1, u_2 v_2) = (e_1, e_2)$ and $(u_1, u_2)$ is a unit of $R_1 \oplus R_2$. ∎
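3-2-20. Remark. Applying this for $\alpha \in \mathbf{Z}_m$, $\beta \in \mathbf{Z}_n$, we have: $(\alpha, \beta)$ is a
unit of $\mathbf{Z}_m \oplus \mathbf{Z}_n \iff \alpha$ is a unit of $\mathbf{Z}_m$ and $\beta$ is a unit of $\mathbf{Z}_n$. In other words,

$$(\alpha, \beta) \in (\mathbf{Z}_m \oplus \mathbf{Z}_n)^* \iff \alpha \in \mathbf{Z}_m^* \text{ and } \beta \in \mathbf{Z}_n^*.$$

It follows that $(\mathbf{Z}_m \oplus \mathbf{Z}_n)^*$ has the same number of elements as the product
set $\mathbf{Z}_m^* \times \mathbf{Z}_n^*$ (which consists of all ordered pairs $(\mu, \nu)$ where $\mu \in \mathbf{Z}_m^*$,
$\nu \in \mathbf{Z}_n^*$); symbolically,

$$\#(\mathbf{Z}_m \oplus \mathbf{Z}_n)^* = \#(\mathbf{Z}_m^* \times \mathbf{Z}_n^*).$$

But because a product set (of two sets) consists of all ordered pairs, it is clear
that

$$\#(\mathbf{Z}_m^* \times \mathbf{Z}_n^*) = \#(\mathbf{Z}_m^*)\,\#(\mathbf{Z}_n^*).$$

Our preliminaries are now complete. To prove 3-2-16 we need only string
them together; in fact,

$$\begin{aligned}
\phi(mn) &= \#(\mathbf{Z}_{mn}^*) \\
&= \#(\mathbf{Z}_m \oplus \mathbf{Z}_n)^* \\
&= \#(\mathbf{Z}_m^* \times \mathbf{Z}_n^*) \\
&= \#(\mathbf{Z}_m^*)\,\#(\mathbf{Z}_n^*) \\
&= \phi(m)\phi(n).
\end{aligned}$$

A more wordy version of this proof goes as follows: By definition, $\phi(mn)$ is
the number of units of $\mathbf{Z}_{mn}$. In virtue of 3-2-17, the rings $\mathbf{Z}_{mn}$ and $\mathbf{Z}_m \oplus \mathbf{Z}_n$
are isomorphic, so according to 3-2-18 their units are in one-to-one
correspondence. Hence, $\phi(mn)$ equals the number of units of $\mathbf{Z}_m \oplus \mathbf{Z}_n$, which number
is denoted by $\#(\mathbf{Z}_m \oplus \mathbf{Z}_n)^*$. Now, 3-2-19 and 3-2-20 tell us that this number
equals $\#(\mathbf{Z}_m^*)\,\#(\mathbf{Z}_n^*)$ (the product of the number of units of $\mathbf{Z}_m$ and the
number of units of $\mathbf{Z}_n$) and this product equals $\phi(m)\phi(n)$. ∎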
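The counting argument just completed can be watched in action for specific moduli. In the sketch below (function names `units` and `unit_pairs_match` are our own, and units are found by the gcd criterion of 3-2-13), we compare the number of units of $\mathbf{Z}_{mn}$ with the product of the numbers of units of $\mathbf{Z}_m$ and $\mathbf{Z}_n$, once with $(m, n) = 1$ and once without.

```python
from math import gcd


def units(m):
    """Least nonnegative representatives of the units of Z_m (m > 1)."""
    return [a for a in range(m) if gcd(a, m) == 1]


def unit_pairs_match(m, n):
    """Check: a is a unit mod mn exactly when (a mod m, a mod n) is a pair of units."""
    um, un = set(units(m)), set(units(n))
    return all((gcd(a, m * n) == 1) == (a % m in um and a % n in un)
               for a in range(m * n))


print(len(units(8 * 15)), len(units(8)) * len(units(15)))   # 32 32
print(len(units(2 * 2)), len(units(2)) * len(units(2)))     # 2 1
print(unit_pairs_match(8, 15))                              # True
```

The second line shows that the hypothesis $(m, n) = 1$ in 3-2-16 cannot simply be dropped: $\phi(4) = 2$ while $\phi(2)\phi(2) = 1$.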


3-2-21. Examples. (1) If $m > 2$, then $\phi(m)$ is even.

Proof: If m is a power of 2, $m = 2^r$, then $r \geq 2$ because $m > 2$. Hence,
$\phi(m) = \phi(2^r) = 2^{r-1}$, which is even. If m is not a power of 2 there exists an
odd prime p that divides m, and we may write

$$m = p^r m' \quad \text{where } (p, m') = 1,\ r \geq 1.$$

In other words, $p^r$ is the power of p which appears in the factorization of m.
Then

$$\phi(m) = \phi(m')\phi(p^r) = \phi(m')\,p^{r-1}(p - 1).$$

Since $p - 1$ is even, so is $\phi(m)$.

(2) Find all m for which $\phi(m) = 14$.

Solution: The only possible prime factors of m are 2, 3, 5, 7, 11, 13; for if
$p^r$ is the power of a prime $p \geq 17$ which appears in the factorization of m, then

$$\phi(m) \geq \phi(p^r) = p^{r-1}(p - 1) \geq 16.$$

Consequently, the prime-power factorization of m must look like

$$m = 2^r 3^s 5^t 7^u 11^v 13^w, \qquad r, s, t, u, v, w \geq 0.$$

Now, if $w \geq 1$, then $\phi(13^w) = 13^{w-1}(13 - 1) = 13^{w-1}(12)$, which is not a factor of $\phi(m) =
14 = 2 \cdot 7$; hence, 13 is not a factor of m. Similarly $\phi(11^v) = 11^{v-1}(10)$,
$\phi(7^u) = 7^{u-1}(6)$, $\phi(5^t) = 5^{t-1}(4)$, so 11, 7, 5 are not factors of m. Thus, m is
reduced to the form $m = 2^r 3^s$. Furthermore, the form of $\phi(m) = 2 \cdot 7$ requires
$r \leq 2$ and $s \leq 1$. One checks the few remaining possibilities and learns that
there are no integers m for which $\phi(m) = 14$.
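The conclusion of (2) can be cross-checked by a direct search, provided we know how far to search. The sketch below assumes the standard inequality $\phi(m) \geq \sqrt{m/2}$, which is not proved in this text; it implies that any m with $\phi(m) = k$ satisfies $m \leq 2k^2$. The function names are our own, and $\phi$ is computed straight from its definition.

```python
from math import gcd


def phi(m):
    """Count the integers in 1..m relatively prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def solutions(value):
    """All m with phi(m) == value, searching m <= 2 * value**2.

    The search bound rests on the assumed inequality phi(m) >= sqrt(m / 2).
    """
    bound = 2 * value * value
    return [m for m in range(1, bound + 1) if phi(m) == value]


print(solutions(14))   # []
print(solutions(4))    # [5, 8, 10, 12]
```

The empty list agrees with the argument above; the second call shows the search returning a nonempty answer for a value that is attained.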

3-2-22. Exercise. The crucial step in the evaluation of $\phi(m)$ was the proof
that $\phi$ is multiplicative. This was accomplished in 3-2-16, 3-2-17, 3-2-18,
3-2-19, and 3-2-20. Our discussion here will lead to a more elementary and
more standard (but less incisive) proof that $\phi$ is multiplicative.

(1) A set of integers is called a complete residue system mod m if it contains
exactly one element from each residue class mod m; we shall say that such a
set is a C(m).

Clearly, the set $\{0, 1, 2, 3, 4\}$ is a C(5), $\{-3, -2, -1, 0, 1, 2, 3\}$ is a C(7),
and $\{-12, -4, 4, 13, 22, 82, 91\}$ is a C(7).

(2) For a set S of integers, the following are equivalent:

(i) S is a C(m).

(ii) Every integer is congruent (mod m) to exactly one element of S.

(iii) S has m elements, no two of which are congruent (mod m).

(iv) The elements of the set $\{[s]_m \mid s \in S\}$ are distinct, and this set
equals $\mathbf{Z}_m$.

(3) Suppose $S = \{s_1, s_2, \ldots, s_m\}$ is a C(m); then

(i) For every integer r, the set

$$S + r = \{s_1 + r, s_2 + r, \ldots, s_m + r\}$$

is a C(m).

(ii) If q is relatively prime to m, the set

$$qS = \{qs_1, qs_2, \ldots, qs_m\}$$

is a C(m).

(iii) If r is any integer and q is relatively prime to m, the set

$$qS + r = \{qs_1 + r, qs_2 + r, \ldots, qs_m + r\}$$

is a C(m).

(4) A set of integers is called a reduced residue system mod m if it contains
exactly one element from each residue class (mod m) which is prime to m, and
no other elements; we shall say that such a set is an R(m).

Clearly, $\{1, 2, 3, 4\}$ is an R(5), $\{-3, -2, -1, +1, +2, +3\}$ is an R(7),
and $\{-11, 41, -29, 11\}$ is an R(12).

(5) For a set S of integers, the following are equivalent:

(i) S is an R(m).

(ii) Every integer a with $(a, m) = 1$ is congruent (mod m) to exactly
one element of S.

(iii) S has $\phi(m)$ elements, all of which are relatively prime to m, and
no two of which are congruent (mod m).

(iv) The elements of the set $\{[s]_m \mid s \in S\}$ are distinct, and this set
equals $\mathbf{Z}_m^*$.

(6) Suppose $S = \{s_1, s_2, \ldots, s_{\phi(m)}\}$ is an R(m); then

(i) If q is relatively prime to m, the set

$$qS = \{qs_1, qs_2, \ldots, qs_{\phi(m)}\}$$

is an R(m).

(ii) If $\{t_1, t_2, \ldots, t_{\phi(m)}\}$ is also an R(m), then

$$\prod_{i=1}^{\phi(m)} s_i \equiv \prod_{i=1}^{\phi(m)} t_i \pmod{m}.$$

(7) If a is any integer relatively prime to m, and $\{s_1, s_2, \ldots, s_{\phi(m)}\}$ is an
R(m), then

$$\prod_{i=1}^{\phi(m)} s_i \equiv \prod_{i=1}^{\phi(m)} (a s_i) \equiv a^{\phi(m)} \prod_{i=1}^{\phi(m)} s_i \pmod{m}$$

and it follows that

$$a^{\phi(m)} \equiv 1 \pmod{m}.$$

This is a famous result due to Euler (1707-1783).

In particular, if m is a prime p and a is an integer not divisible by p, then

$$a^{p-1} \equiv 1 \pmod{p}$$

and, in addition,

$$a^p \equiv a \pmod{p}$$

[a result discovered by Fermat, which was proved earlier in 2-4-9].
(8) Suppose $(m, n) = 1$, $S = \{s_1, s_2, \ldots, s_{\phi(m)}\}$ is an R(m), and $T =
\{t_1, t_2, \ldots, t_{\phi(n)}\}$ is an R(n). Consider the set

$$nS + mT = \{ns_i + mt_j \mid i = 1, \ldots, \phi(m),\ j = 1, \ldots, \phi(n)\}$$

which may also be written as

$$nS + mT = \{ns + mt \mid s \in S,\ t \in T\}.$$

(i) No two elements of $nS + mT$ are congruent (mod mn). In
particular, the elements of $nS + mT$ are distinct and $\#(nS + mT)
= \phi(m)\phi(n)$.

(ii) Every element of $nS + mT$ is relatively prime to m, and to n also;
hence, it is prime to mn.

(iii) If a is an integer prime to mn, then a is congruent (mod mn) to
some element of $nS + mT$.

Putting (i), (ii), (iii) together, we see that $nS + mT$ is an R(mn), so it has
$\phi(mn)$ elements. Consequently,

$$\phi(mn) = \phi(m)\phi(n).$$
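Before attempting the proofs in (8), it may help to see the construction $nS + mT$ carried out for specific moduli. The sketch below takes for S and T the particular reduced residue systems consisting of the integers between 1 and the modulus that are prime to it; this choice, and the function names, are assumptions made for illustration. The test `is_reduced_system` follows condition (5)(iv).

```python
from math import gcd


def reduced_system(m):
    """The reduced residue system {a : 1 <= a <= m, (a, m) = 1}."""
    return [a for a in range(1, m + 1) if gcd(a, m) == 1]


def combined_system(m, n):
    """The set nS + mT built from reduced systems S mod m and T mod n."""
    S, T = reduced_system(m), reduced_system(n)
    return [n * s + m * t for s in S for t in T]


def is_reduced_system(values, m):
    """Condition (5)(iv): the classes of the values are distinct and are exactly the units."""
    classes = sorted(v % m for v in values)
    return classes == sorted(a % m for a in reduced_system(m))


m, n = 5, 12
print(len(combined_system(m, n)))                         # 16
print(is_reduced_system(combined_system(m, n), m * n))    # True
print(is_reduced_system(combined_system(4, 6), 24))       # False
```

The last line uses $m = 4$, $n = 6$, which are not relatively prime, and the construction fails there, as one would expect from the hypothesis of (8).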


3-2-23/ PROBLEMS 

1. Are the following statements true or false? Justify your answers. 

(o IUJIi£U in *24, 

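(ii) $[8]_{29} \mid [15]_{29}$ in $\mathbf{Z}_{29}$,

(iii) $(3 - 2i) \mid (2 - 23i)$ in $\mathbf{Z}[i]$,

(iv) $(3 - 2\sqrt{2}) \mid (2 - 23\sqrt{2})$ in $\mathbf{Z}[\sqrt{2}]$,

(v) $(3 + 4\sqrt{2}) \mid (7 - 2\sqrt{2})$ in $\mathbf{Z}[\sqrt{2}]$,

(vi) $(3 + 4\sqrt{2}) \mid (7 - 2\sqrt{2})$ in $\mathbf{Q}[\sqrt{2}]$,

(vii) $(2 + \sqrt{6}) \mid (3 - 2\sqrt{6})$ in
(viii) $(2 + \sqrt{6}) \mid (3 - 2\sqrt{6})$ in $\mathbf{R}$.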

2. Find all units of

(i) $\mathbf{Z}_6$, (ii) $\mathbf{Z}_{15}$, (iii) $\mathbf{Z}_{24}$.

3. Find all units of 

(/) (ii) z[y/^s], (iii) OtV 3 ^]. 

4. We know that Z 13 is a field; find the multiplicative inverse of every 
nonzero element. 
5. If m is composite, show that $\mathbf{Z}_m$ is not a field.

6. (i) Give an example of a nonzero element of a commutative ring with
unity which is neither a unit nor a zero-divisor.

(ii) In $\mathbf{Z}_{15}$, verify that every nonzero element is either a unit or a
zero-divisor.

(iii) Prove that a nonzero element of any residue class ring $\mathbf{Z}_m$ is either a
unit or a zero-divisor (but not both).

7. Prove that $\mathbf{Q}[\sqrt{2}]$ is a field. What about $\mathbf{Q}[\sqrt{m}]$, where m is any
square-free integer, positive or negative?

8. In an integral domain D, show that both $a \mid b$ and $b \mid a$ if and only if
$a = bu$ where u is a unit of D.

9. Find as many units as you can in

(i) $\mathbf{Z}[\sqrt{3}]$, (ii) $\mathbf{Z}[\sqrt{6}]$,

(iii) $\mathbf{Z}[\sqrt{7}]$, (iv) $\mathbf{Z}[\sqrt{10}]$.

10. Which of the following are units of Z 851 ? 

(0 [2^ 851 . (H) [H5J 851 , (Hi) liiLWi- (iv) l 10Q U.- 

For each one that is a unit, find its inverse. 

11. Prove the rules for exponents listed in 3-2-5, part (9) for the units $a, b \in R$.
To what extent are they true if a and b are not units?

12. We have seen in 3-2-5, part (8) that the product of two units is a unit.
Make a multiplication table for the units of:

(i) $\mathbf{Z}_5$, (ii) $\mathbf{Z}_6$, (iii) $\mathbf{Z}_7$, (iv) $\mathbf{Z}_{10}$,

(v) $\mathbf{Z}_{11}$, (vi) $\mathbf{Z}_{12}$, (vii) $\mathbf{Z}_{14}$, (viii) $\mathbf{Z}_{15}$.

13. Evaluate $\phi(m)$ when m is

(i) $11^{15}$, (ii) $2^3 \cdot 7^4 \cdot 13^2 \cdot 17 \cdot 67^3$, (iii) 18,900,

(iv) 16,807, (v) 305,453.

14. Show that $\phi(m) = \phi(2m) \iff$ m is odd.

15. Find the smallest positive even integer a for which $\phi(m) = a$ has no
solution (that is, for which no such m exists).

16. Derive or justify the expression

$$\phi(m) = \prod_{p \mid m} (p^{r} - p^{r-1}).$$

17. There are 10 integers m for which $\phi(m) = 24 = 2^3 \cdot 3$; find them.
18. (i) If A and B are finite sets, then $\#(A \times B) = (\#A)(\#B)$.

(ii) If $A_2 \subset A_1$ and $B_2 \subset B_1$ are all finite sets, then

$$\#[(A_1 - A_2) \times (B_1 - B_2)] = (\#A_1)(\#B_1) - (\#A_1)(\#B_2) - (\#A_2)(\#B_1) + (\#A_2)(\#B_2).$$

19. (i) If $(m, n) = 1$, then for any $a \in \mathbf{Z}$

$$[a]_m \cap [a]_n = [a]_{mn} \quad \text{(as sets)}.$$

(ii) What happens if m and n are not relatively prime?

20. For integers $m_1, m_2, \ldots, m_r$, all greater than 1, show that the map

$$a \to ([a]_{m_1}, [a]_{m_2}, \ldots, [a]_{m_r})$$

of

$$\mathbf{Z} \to \mathbf{Z}_{m_1} \oplus \mathbf{Z}_{m_2} \oplus \cdots \oplus \mathbf{Z}_{m_r}$$

is a homomorphism. What is its kernel? When is it surjective?

21. (i) In $\mathbf{Z}_7$, compute $\alpha^6$ for each $\alpha \neq 0$.

(ii) In $\mathbf{Z}_{12}$, compute $\alpha^4$ for $\alpha = [1]_{12}, [5]_{12}, [7]_{12}, [11]_{12}$.

(iii) For each $\alpha \neq 0$ in $\mathbf{Z}_{13}$ compute $\alpha^{12}$.

Why do the answers come out as they do?

22. Is the congruence

$$11^{7 \cdot 17 \cdot 37} \equiv 11 \pmod{7 \cdot 17 \cdot 37}$$

valid? How is this related to the result of Fermat, 2-4-9?

23. Suppose R is a C(m), S is a C(n), and $(m, n) = 1$. Show that $nR + mS$ is a
C(mn).

24. For each of the following values of m compute $\sum_{d \mid m} \phi(d)$; that is, compute
the sum of all $\phi(d)$ as d runs over all positive divisors of m. (Note that 1
is always counted as a divisor of m, and so is m.)

(i) 24, (ii) 27, (iii) 49, (iv) 50, (v) 60.

25. (i) Prove that if p is prime and $r \geq 1$, then

$$\sum_{i=0}^{r} \phi(p^i) = p^r.$$

(ii) For arbitrary $m > 1$ prove that

$$\sum_{d \mid m} \phi(d) = m$$

by using an induction based on writing $m = p^r n$ where p is prime
and $(p, n) = 1$.

(iii) Give an alternate proof of (ii) by showing that for each divisor d of m,
the number of elements from the set $\{1, 2, \ldots, m\}$ which have d as
their greatest common divisor with m is precisely $\phi(m/d)$.

26. Is a subring of a field a field? Is it a domain? Explain.

27. (i) Suppose $f: R \to S$ is an isomorphism of rings. If the element $a \in R$ has
an inverse for multiplication, so does $f(a) \in S$, and $f(a^{-1}) = (f(a))^{-1}$.
Does this assertion apply when $f: R \to S$ is a surjective homomorphism?

(ii) Prove: If $R$ and $S$ are isomorphic rings, then $R$ is a domain $\iff$ $S$ is a
domain, and $R$ is a field $\iff$ $S$ is a field.

28. Let $\alpha$ be a formal symbol, and consider

\[ \mathbf{Z}_2[\alpha] = \{a + b\alpha \mid a, b \in \mathbf{Z}_2\}. \]

Thus, $\mathbf{Z}_2[\alpha]$ consists of the four elements $0 = 0 + 0\alpha$, $1 = 1 + 0\alpha$,
$\alpha = 0 + 1\alpha$, and $1 + \alpha$. Define addition and multiplication in $\mathbf{Z}_2[\alpha]$ just
as for polynomials except that we write $\alpha^2 = 1 + \alpha$. Make addition and
multiplication tables for $\mathbf{Z}_2[\alpha]$, and show that it is a field with four
elements.

3-3. Polynomials and Polynomial Functions 

Before we discuss solving expressions like

\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0 \]

in certain special situations, it is appropriate to examine in some detail what
is meant by such an expression. This requires a good deal of fussing, and is
not very exciting, but it needs to be done.

3-3-1. Notation. Throughout this section we suppose $\{R, \perp, \cdot\}$ is a
commutative ring with unity. The elements of $R$ will be denoted by $a, b, c, \ldots$. In
addition, henceforth, we shall denote the unity element of $R$ by 1 instead of
$e$; this will be a small convenience and should cause no difficulty. Note the
choice of $\perp$ to denote addition in $R$. The more usual symbol $+$ is reserved
temporarily for something else.

Let $x$ be a symbol which has no connection of any kind with $R$; in particular,
$x$ is not an element of $R$. Such a formal symbol $x$ is called an indeterminate
over $R$ or a transcendental over $R$. Consider all “formal” symbols or
expressions of form

\[ a_0 x^0 + a_1 x^1 + a_2 x^2 + \cdots + a_n x^n \tag{$*$} \]

where $n$ is an integer greater than or equal to 0 and $a_0, a_1, a_2, \ldots, a_n$ are
elements of $R$. Any expression of form (*) is called a polynomial in $x$ with
coefficients in $R$, or in abbreviated terminology, a polynomial in $x$ over $R$. [Of
course, it is more common to write polynomials in the form

\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \]

and we shall eventually do so; however, at this stage, there are certain small
technical advantages in using the notation of (*).] The set of all such
polynomials, for all choices of $n \ge 0$ and all choices of $a_0, a_1, \ldots, a_n$ in $R$, is
denoted by $R[x]$. Our immediate objective is to show how $R[x]$ can be made into
a ring; but first a number of preliminary remarks are in order.

Note first that the notation for a polynomial is somewhat deceptive. The
symbol $+$ has nothing to do with addition; at this point, it is merely a symbol
that provides separation between parts of a polynomial, and any other symbol
could serve just as well. The symbols $x^0, x^1, x^2, \ldots, x^n$, which should be read
as $x$ upper zero, $x$ upper 1, $x$ upper 2, ..., $x$ upper $n$, have nothing to do with
powers of $x$ (after all, since $x$ is not in $R$, multiplying $x$ by itself has no
meaning); they are just formal symbols. Furthermore, pieces of a polynomial like
$a_0 x^0, a_1 x^1, \ldots, a_n x^n$ have no connection with multiplication (after all, we
have no way of multiplying an element $a_i \in R$ and a formal symbol $x^i$ which
has no connection with $R$); we simply prefer to write things in this way.

One may naturally ask: if the symbols are not what they seem to be, why 
not put the notation in another, less suggestive, form? The answer is that, in 
the end, all our symbols will behave exactly as the notation leads us to expect. 

3-3-2. Conventions. By a term of the polynomial

\[ a_0 x^0 + a_1 x^1 + a_2 x^2 + \cdots + a_n x^n \tag{$*$} \]

we mean any one of the symbols $a_0 x^0, a_1 x^1, a_2 x^2, \ldots, a_n x^n$. As indicated
earlier, the elements $a_0, a_1, a_2, \ldots, a_n$ of $R$ should be referred to as the
coefficients of the polynomial, or of the respective terms. The term $a_n x^n$ is
called the leading term and its coefficient $a_n$ is known as the leading coefficient
of the polynomial. It is customary (but not essential) to arrange the notation
so that $a_n \ne 0$. If $a_n = 1$, the polynomial is said to be monic.

According to the definition, an element of $\mathbf{Z}[x]$ (that is, a polynomial in
$x$ over $\mathbf{Z}$) should look something like $2x^0 + 3x^1 + 1x^2 + 5x^3 + 4x^4$. On the
other hand, in high school and elsewhere, we often came upon polynomials
that looked like $3x^6 - 2x^4 + x^2 - 1$, or $5x^7$, or $x^8 - x^2 - 2x^3 + 3x^4$. These
differ from our formal definition of polynomial in several ways: for example,
the use of the $-$ sign, the appearance of an unadorned element of $\mathbf{Z}$ (without
an $x^i$), writing a term like $x^2$ without a coefficient, putting terms in the wrong
order, or omitting some terms. The apparent discrepancies will be ironed out
eventually.

A typical element of $R[x]$ will be denoted by $f(x)$, that is,

\[ f(x) = a_0 x^0 + a_1 x^1 + a_2 x^2 + \cdots + a_n x^n \]

and this expression may also be written compactly as

\[ f(x) = \sum_{i=0}^{n} a_i x^i. \]

Note that $f(x)$ is not a function; it represents a polynomial and nothing else.
Also, it is not quite appropriate to refer to $\sum$ as “sum” or “summation,”
since $+$ does not represent addition; it is probably best at this point to read
the symbol $\sum$ as “sigma.”

Now, suppose $g(x) \in R[x]$ is another polynomial, say

\[ g(x) = b_0 x^0 + b_1 x^1 + b_2 x^2 + \cdots + b_m x^m = \sum_{i=0}^{m} b_i x^i. \]

We shall say that the polynomials $f(x)$ and $g(x)$ are equal, and write $f(x) = g(x)$,
when $a_i = b_i$ for all $i$. Strictly speaking this definition has meaning only
when $m = n$; but when $m \ne n$, say $m < n$, it is to be understood that we are
taking $b_{m+1} = b_{m+2} = \cdots = b_n = 0$. Since we have permitted the introduction
of terms with 0 coefficient without affecting the polynomial, it is only natural
that we should also permit the dropping of terms with 0 coefficient (even if
they are in the middle) without affecting the polynomial. Thus, for example,
the polynomials $a_0 x^0 + a_3 x^3 + a_5 x^5$ and $a_0 x^0 + 0x^1 + 0x^2 + a_3 x^3 + 0x^4 + a_5 x^5$
are considered to be equal. In general, any missing term is considered
to have coefficient 0. Note that any term $a_i x^i$ of a polynomial thus
becomes a polynomial (that is, an element of $R[x]$) in its own right, namely,

\[ a_i x^i = 0x^0 + 0x^1 + \cdots + 0x^{i-1} + a_i x^i. \]

Once we have agreed about throwing in or throwing out terms with
coefficient 0, why not go all the way? Instead of writing a generic polynomial as

\[ f(x) = a_0 x^0 + a_1 x^1 + \cdots + a_n x^n = \sum_{i=0}^{n} a_i x^i \quad \text{for some } n \ge 0, \]

let us write it in the form

\[ f(x) = \sum_{i=0}^{\infty} a_i x^i \]

with the understanding that almost all $a_i$ equal 0 (that is, only a finite number
of the coefficients $a_i$ are not equal to 0). Consequently, if $g(x) = \sum_{i=0}^{\infty} b_i x^i$ is
also a polynomial in $x$ over $R$, then our definition of equality for polynomials
takes the form

\[ f(x) = g(x) \iff a_i = b_i, \quad \text{for } i = 0, 1, 2, \ldots. \]
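The equality convention can be made concrete with a small sketch. It assumes (this is not in the text) that a polynomial is stored as a Python list whose entry at index i is the coefficient of x upper i; the helper name `poly_equal` is hypothetical.

```python
from itertools import zip_longest


def poly_equal(f, g):
    """Return True when a_i == b_i for every i.

    A coefficient missing from the shorter list is read as 0, which
    mirrors the convention that zero terms may be added or dropped.
    """
    return all(a == b for a, b in zip_longest(f, g, fillvalue=0))


# a_0 x^0 + a_3 x^3 + a_5 x^5 with (a_0, a_3, a_5) = (7, 2, 4),
# written once without and once with trailing zero terms.
p = [7, 0, 0, 2, 0, 4]
q = [7, 0, 0, 2, 0, 4, 0, 0]

print(poly_equal(p, q))               # True
print(poly_equal([1, 2], [1, 2, 3]))  # False
```

Padding the shorter list with zeros is only one way to realize the convention; stripping trailing zeros before comparing would serve equally well.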

Now, let us turn to addition and multiplication of polynomials. From our
high school experience, this is a familiar matter. For example, using the
standard high school notation: If $f(x)$ and $g(x)$ are the polynomials with
integer coefficients

\[ f(x) = 2 + 3x^2 + x^3, \qquad g(x) = 1 - x + 2x^4, \]

then these may be rewritten as

\[
\begin{aligned}
f(x) &= 2 + 0x + 3x^2 + 1x^3 + 0x^4, \\
g(x) &= 1 + (-1)x + 0x^2 + 0x^3 + 2x^4,
\end{aligned}
\]

and then

\[
\begin{aligned}
f(x) + g(x) &= (2 + 1) + (0 + (-1))x + (3 + 0)x^2 + (1 + 0)x^3 + (0 + 2)x^4 \\
&= 3 - x + 3x^2 + x^3 + 2x^4
\end{aligned}
\]

and

\[
\begin{aligned}
f(x) \cdot g(x) ={}& [(2)(1)] + [(2)(-1) + (0)(1)]x + [(2)(0) + (0)(-1) + (3)(1)]x^2 \\
&+ [(2)(0) + (0)(0) + (3)(-1) + (1)(1)]x^3 \\
&+ [(2)(2) + (0)(0) + (3)(0) + (1)(-1) + (0)(1)]x^4 \\
&+ [(0)(2) + (3)(0) + (1)(0) + (0)(-1)]x^5 \\
&+ [(3)(2) + (1)(0) + (0)(0)]x^6 \\
&+ [(1)(2) + (0)(0)]x^7 + [(0)(2)]x^8 \\
={}& 2 - 2x + 3x^2 - 2x^3 + 3x^4 + 6x^6 + 2x^7.
\end{aligned}
\]

It is precisely these familiar rules for addition and multiplication of
polynomials which we carry over to $R[x]$.
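The high school computation above can be replayed mechanically. The following is a minimal sketch, assuming integer coefficients stored in a list indexed by exponent; the names `poly_add` and `poly_mul` are illustrative and do not come from the text.

```python
from itertools import zip_longest


def poly_add(f, g):
    """Add coefficientwise, reading missing coefficients as 0."""
    return [a + b for a, b in zip_longest(f, g, fillvalue=0)]


def poly_mul(f, g):
    """The coefficient of x^i is the sum of all a_r * b_s with r + s = i."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for r, a in enumerate(f):
        for s, b in enumerate(g):
            out[r + s] += a * b
    return out


f = [2, 0, 3, 1]       # 2 + 3x^2 + x^3
g = [1, -1, 0, 0, 2]   # 1 - x + 2x^4

print(poly_add(f, g))  # [3, -1, 3, 1, 2]
print(poly_mul(f, g))  # [2, -2, 3, -2, 3, 0, 6, 2]
```

The output lists read off as 3 - x + 3x^2 + x^3 + 2x^4 and 2 - 2x + 3x^2 - 2x^3 + 3x^4 + 6x^6 + 2x^7, in agreement with the hand computation.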

In detail: Suppose $\{R, \perp, \cdot\}$ is a commutative ring with unity and consider
any polynomials

\[ f(x) = \sum_{0}^{\infty} a_i x^i, \qquad g(x) = \sum_{0}^{\infty} b_i x^i \]

in $R[x]$. We define their sum $f(x) \oplus g(x)$ by

\[ \Bigl(\sum_{0}^{\infty} a_i x^i\Bigr) \oplus \Bigl(\sum_{0}^{\infty} b_i x^i\Bigr) = \sum_{0}^{\infty} (a_i \perp b_i) x^i. \]

In other words, as expected, we have put $f(x) \oplus g(x) = \sum_{0}^{\infty} c_i x^i$ where
$c_i = a_i \perp b_i$ for each $i = 0, 1, 2, \ldots$. (Note that since $a_i, b_i$ come from $R$, their sum
is denoted by $a_i \perp b_i$.) Of course, $c_i = a_i \perp b_i = 0$ for almost all $i$, because $a_i$
equals 0 for almost all $i$ and so does $b_i$; thus $\sum_{0}^{\infty} (a_i \perp b_i) x^i$ has only a finite
number of nonzero coefficients, and it is indeed a polynomial in $x$ over $R$.

The definition of the product $f(x) \odot g(x)$ is somewhat more complicated
than the definition of the sum $f(x) \oplus g(x)$; but again, as for $\oplus$, it is entirely
in keeping with our high school procedures, namely,

\[ \Bigl(\sum_{0}^{\infty} a_i x^i\Bigr) \odot \Bigl(\sum_{0}^{\infty} b_i x^i\Bigr) = \sum_{0}^{\infty} c_i x^i \]

where $c_0 = a_0 \cdot b_0$, $c_1 = a_0 \cdot b_1 \perp a_1 \cdot b_0$, $c_2 = a_0 \cdot b_2 \perp a_1 \cdot b_1 \perp a_2 \cdot b_0$,
$c_3 = a_0 \cdot b_3 \perp a_1 \cdot b_2 \perp a_2 \cdot b_1 \perp a_3 \cdot b_0$, ..., and in general, for any
$i = 0, 1, 2, 3, \ldots$

\[ c_i = a_0 \cdot b_i \perp a_1 \cdot b_{i-1} \perp \cdots \perp a_{i-1} \cdot b_1 \perp a_i \cdot b_0 = \sum_{r=0}^{i} a_r \cdot b_{i-r}. \]

Clearly, $c_i$ is the sum of all possible products $a_r \cdot b_s$ where $r + s = i$, so it may
be written as

\[ c_i = \sum_{r+s=i} a_r \cdot b_s, \qquad 0 \le r, s \le i. \]

(Here, $\sum$ denotes summation in $R$ and, of course, $\cdot$ denotes multiplication in $R$.
We have used the symbols $\oplus$ and $\odot$ for addition and multiplication in $R[x]$
because the symbols $\perp$, $\cdot$, $+$ already have meaning, and we prefer to maintain
the distinction between the new operations and the old ones.) We must
verify that $f(x) \odot g(x) = \sum_{0}^{\infty} c_i x^i$ is really an element of $R[x]$; that is, that
$c_i = 0$ for almost all $i$. To do this, let $a_n$ be the last nonzero coefficient of $f(x)$
and let $b_m$ be the last nonzero coefficient of $g(x)$. Thus, $a_i = 0$ for all $i > n$,
and $b_i = 0$ for all $i > m$; so we may write

\[ f(x) = \sum_{0}^{n} a_i x^i, \qquad g(x) = \sum_{0}^{m} b_i x^i. \]

Consequently, in virtue of our high school experience, we surely expect $c_i$ to
equal 0 for every $i > n + m$. In fact, when $i > n + m$ is fixed, each of the terms
$a_r \cdot b_s$ which appear in $c_i = \sum_{r+s=i} a_r \cdot b_s$ must be 0. To see this, note that if
$i > n + m$ and $r + s = i$ with $0 \le r, s \le i$, then either $r > n$ or $s > m$ or both
(for if $r \le n$ and $s \le m$, then $i = r + s \le n + m$); so, by the way $n$ and $m$ are
chosen, in each term $a_r \cdot b_s$ either $a_r = 0$ or $b_s = 0$ or both. Hence, $c_i = 0$ for
$i > n + m$ and $f(x) \odot g(x)$ is an element of $R[x]$ which may also be written in
the form

\[ f(x) \odot g(x) = \sum_{0}^{n+m} c_i x^i. \]

It is worth noting, once more, that the multiplication we have defined in
$R[x]$ is entirely consistent with the way the reader has always multiplied
polynomials, as illustrated above for $f(x) = 2 + 3x^2 + x^3$ and $g(x) = 1 - x + 2x^4$
in $\mathbf{Z}[x]$.


3-3-3. Theorem. If $R$ is a commutative ring with unity, then $R[x]$ is a
commutative ring with unity which contains an isomorphic copy of $R$.
More precisely, there is a “natural” injective homomorphism of the ring
$\{R, \perp, \cdot\}$ into the ring $\{R[x], \oplus, \odot\}$.

Proof: First of all, we need to verify that $\{R[x], \oplus, \odot\}$ is a commutative
ring with unity. Closure has already been proved for both $\oplus$ and $\odot$. The
associative law for addition holds: For if $f(x) = \sum_{0}^{\infty} a_i x^i$, $g(x) = \sum_{0}^{\infty} b_i x^i$,
$h(x) = \sum_{0}^{\infty} c_i x^i$ are any elements of $R[x]$, then, in virtue of the associative law
for $\perp$ in $R$, it is easy to see that both polynomials $(f(x) \oplus g(x)) \oplus h(x)$ and
$f(x) \oplus (g(x) \oplus h(x))$ are equal to $\sum_{0}^{\infty} (a_i \perp b_i \perp c_i) x^i$. The polynomial
$\sum_{0}^{\infty} 0x^i = 0x^0 + 0x^1 + 0x^2 + \cdots$ is clearly an identity for addition; we denote it by 0. In
addition, the polynomial $\sum_{0}^{\infty} (-a_i) x^i$ is surely an additive inverse for
$f(x) = \sum_{0}^{\infty} a_i x^i$, so we shall write $-(f(x))$ for $\sum_{0}^{\infty} (-a_i) x^i$. As for the commutative
law for addition, for any $i$ the coefficient of $x^i$ in $f(x) \oplus g(x)$ is $a_i \perp b_i$, which
is equal to $b_i \perp a_i$, the coefficient of $x^i$ in $g(x) \oplus f(x)$.

Turning to the requirements for $\odot$ in $R[x]$, we note first that the polynomial

\[ 1x^0 = 1x^0 + 0x^1 + 0x^2 + 0x^3 + \cdots \]

is obviously an identity element for multiplication. The commutative law for
$\odot$ holds because, for each $i$, the coefficient of $x^i$ in $f(x) \odot g(x)$ is

\[ a_0 \cdot b_i \perp a_1 \cdot b_{i-1} \perp \cdots \perp a_{i-1} \cdot b_1 \perp a_i \cdot b_0 \]

while the coefficient of $x^i$ in $g(x) \odot f(x)$ is

\[ b_0 \cdot a_i \perp b_1 \cdot a_{i-1} \perp \cdots \perp b_{i-1} \cdot a_1 \perp b_i \cdot a_0, \]

and these are equal because both addition and multiplication in $R$ are
commutative. To prove the validity of the associative law for $\odot$, we must show
that, for each $i \ge 0$, the coefficient of $x^i$ in $(f(x) \odot g(x)) \odot h(x)$ is equal to the
coefficient of $x^i$ in $f(x) \odot (g(x) \odot h(x))$. By being careful about subscripts, it
may be shown that in each case the coefficient of $x^i$ turns out to be

\[ \sum_{r+s+t=i} a_r \cdot b_s \cdot c_t, \qquad 0 \le r, s, t \le i \]

(that is, the sum of all products of form $a_r \cdot b_s \cdot c_t$ where the sum of the
subscripts is $i$). The details are left to the reader.

Finally, it is only necessary to check one distributive law, because $\odot$ is
commutative. This is a straightforward matter which may safely be left to the
reader. The proof that $\{R[x], \oplus, \odot\}$ is a commutative ring with unity is now
complete, so we turn to the “imbedding” or “injection” of $R$ into $R[x]$.

Let us define the mapping $\phi: R \to R[x]$ by

\[ \phi(a) = ax^0, \qquad a \in R. \]

In other words, the mapping $\phi$ takes the element $a \in R$ to the polynomial
$ax^0 = ax^0 + 0x^1 + 0x^2 + \cdots$ in $R[x]$. The verification that $\phi$ is a
homomorphism is easy; in fact, for $a, b \in R$ we have

\[ \phi(a \perp b) = (a \perp b)x^0 = (ax^0) \oplus (bx^0) = \phi(a) \oplus \phi(b), \]

\[ \phi(a \cdot b) = (a \cdot b)x^0 = (ax^0) \odot (bx^0) = \phi(a) \odot \phi(b). \]

(Note the distinction which must be made between the operations $\perp$ and $\cdot$
in $R$ as opposed to $\oplus$ and $\odot$ in $R[x]$.) Since $\phi$ is a homomorphism, we know
(from 2-5-8) that $\phi(0) = 0$ or, what is the same thing, $0 \in \ker \phi$. Furthermore,
if $a \in \ker \phi$, then $\phi(a) = 0$, which says that $ax^0 = ax^0 + 0x^1 + 0x^2 + \cdots$ is
the zero polynomial; hence, $a = 0$. Consequently, $\ker \phi = (0)$ and according
to 2-5-9, $\phi$ is injective. Thus, $\phi$ is a (“natural”) injective homomorphism
of the ring $\{R, \perp, \cdot\}$ into the ring $\{R[x], \oplus, \odot\}$.

Now, making use of parts (1) and (4) of 2-5-9, the image

\[ \phi(R) = \{ax^0 \mid a \in R\} \]

of the homomorphism $\phi$ is a subring of $\{R[x], \oplus, \odot\}$ and, because $\phi$ is
injective, $\phi$ is an isomorphism of $R$ onto $\phi(R)$. In particular, $\phi(R)$ (or better,
$\{\phi(R), \oplus, \odot\}$) is an isomorphic copy of $R$ which is contained in $R[x]$, and it
is also customary to say that $\phi$ is an “imbedding” of $R$ into $R[x]$. ∎
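The imbedding can be spot-checked by brute force in a small case. This is only a sketch under assumptions that are not in the text: R is taken to be the ring of integers modulo 6, polynomials are coefficient lists with trailing zeros removed, and the helper names (`padd`, `pmul`, `embed`) are made up for the illustration.

```python
from itertools import product, zip_longest

M = 6  # assumption: R = Z_6, chosen only to have a small finite ring


def trim(p):
    """Drop trailing zero coefficients so equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def padd(f, g):
    """Addition in R[x]: add coefficients in R."""
    return trim((a + b) % M for a, b in zip_longest(f, g, fillvalue=0))


def pmul(f, g):
    """Multiplication in R[x]: c_i is the sum of a_r * b_s over r + s = i."""
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for r, a in enumerate(f):
        for s, b in enumerate(g):
            out[r + s] = (out[r + s] + a * b) % M
    return trim(out)


def embed(a):
    """The map a -> a x^0, sending a ring element to a constant polynomial."""
    return trim([a % M])


# The map respects both operations for every pair (a, b) in R x R ...
is_homomorphism = all(
    embed((a + b) % M) == padd(embed(a), embed(b))
    and embed((a * b) % M) == pmul(embed(a), embed(b))
    for a, b in product(range(M), repeat=2)
)
# ... and distinct elements of R go to distinct polynomials.
is_injective = len({tuple(embed(a)) for a in range(M)}) == M

print(is_homomorphism, is_injective)  # True True
```

A finite check of this kind proves nothing about a general ring, but it does exercise every case of the two displayed identities for this particular R.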

3-3-4. Notation. Having proved that $\{R[x], \oplus, \odot\}$ is a ring, let us sketch
how, and with what justification, the notation may be adjusted to make it
conform with the standard notation.

For $i \ge 0$, let us write $1x^i$ as $x^i$. Furthermore, let us denote $x^1$ by $x$. Each
$x^i$ may be viewed as representing the term $1x^i$ in some polynomial, and also as
the polynomial $0x^0 + 0x^1 + \cdots + 1x^i + 0x^{i+1} + \cdots$. In particular, $x$ may be
viewed as an element of $R[x]$, namely, $x = 0x^0 + 1x^1$. Once $x$ belongs to the
ring $R[x]$, we may multiply it by itself; thus

\[ x \odot x = (0x^0 + 1x^1) \odot (0x^0 + 1x^1) = 0x^0 + 0x^1 + 1x^2 = 1x^2 = x^2 \]

and, more generally, for all $i \ge 2$ we have, by induction,

\[
\begin{aligned}
\underbrace{x \odot x \odot \cdots \odot x}_{i \text{ terms}} &= (0x^0 + 1x^1) \odot (0x^0 + 1x^1) \odot \cdots \odot (0x^0 + 1x^1) \\
&= 0x^0 + 0x^1 + \cdots + 0x^{i-1} + 1x^i \\
&= 1x^i \\
&= x^i.
\end{aligned}
\]

According to this, $x^i$ is what it looks like! More precisely, the result of
multiplying $i$ copies of $x$ in the ring $R[x]$ is $x^i$, which means $x$ to the power $i$;
on the other hand, heretofore in this section we have been using $x^i$ solely as a
formal symbol, called “$x$ upper $i$.” In virtue of the above, the formal symbol
$x^i$ is to be interpreted as the $i$th power of $x$ for all $i \ge 1$.

We observe next that the symbol $x^0$ has two possible meanings. On the
one hand, it is $x$ upper zero which is the same as the element $1x^0$ of $R[x]$. On
the other hand, $x^0$ can refer to raising the element $x$ of $R[x]$ to the zero power.
Since in a ring with identity the zero power of any nonzero element is the
identity, we see that $x^0$ (that is, $x$ to the power zero) equals $1x^0$, the identity
for multiplication in the ring $R[x]$. Thus, the two interpretations of $x^0$ give the
same result, and our notations are still compatible.

We have already proved that the ring $R$ is isomorphic to the subring
$\phi(R) = \{ax^0 \mid a \in R\}$ of $R[x]$, so let us replace the elements of $\phi(R)$ by the
corresponding elements of $R$. Thus, we denote the polynomial $ax^0$ by $a$, and
in writing polynomials we simply omit the $x^0$. An arbitrary polynomial now
takes the familiar form

\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n. \tag{$*$} \]

If $a$ is any element of $R$, then $a$ also represents the polynomial
$a + 0x + 0x^2 + \cdots$ in $R[x]$. Consequently, $R$ may be considered as a subset of $R[x]$;
that is, $R \subset R[x]$.

Now, consider any element $a_i \in R$ and any power $x^i$. Both of these are
elements of $R[x]$, so it is of interest to find their product (with respect to $\odot$,
of course). We have

\[
\begin{aligned}
a_i \odot x^i = a_i x^0 \odot 1x^i &= (a_i x^0 + 0x^1 + \cdots) \odot (0x^0 + 0x^1 + \cdots + 1x^i) \\
&= 0x^0 + 0x^1 + \cdots + (a_i \cdot 1)x^i \\
&= a_i x^i
\end{aligned}
\]

which tells us that the polynomial $a_i x^i$ in $R[x]$ (or the term $a_i x^i$ of some
polynomial) is exactly what it looks like: it represents the product in $R[x]$ of
the two elements $a_i$ and $x^i$ of $R[x]$.

Given two polynomials (or rather, terms) $a_i x^i$ and $a_j x^j$ where $i < j$, let us
compute their sum (with respect to $\oplus$, of course). We have

\[
\begin{aligned}
a_i x^i \oplus a_j x^j &= (0 + 0x + \cdots + a_i x^i + \cdots) \oplus (0 + 0x + \cdots + a_j x^j + \cdots) \\
&= 0 + 0x + \cdots + a_i x^i + \cdots + a_j x^j + \cdots \\
&= a_i x^i + a_j x^j,
\end{aligned}
\]

so it follows inductively that for any polynomial (*) of $R[x]$ we have

\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = a_0 \oplus a_1 x \oplus a_2 x^2 \oplus \cdots \oplus a_n x^n. \]





Thus, the formal symbol $+$ may be replaced by $\oplus$ (the symbol for addition in
$R[x]$) whenever it appears, and our polynomial is built up from the element $x$
and the elements $a_0, \ldots, a_n$ of $R$ (all of which belong to $R[x]$) by repeated
addition, $\oplus$, and multiplication, $\odot$. Furthermore, because $\oplus$ is a commutative
operation, the order in which the terms of a polynomial $a_0 \oplus a_1 x \oplus \cdots \oplus a_n x^n$
(which is also the same as $a_0 \oplus a_1 \odot x \oplus \cdots \oplus a_n \odot x^n$) appear does
not matter.

It remains to examine the imbedding of $R$ in $R[x]$. We have agreed to use
the symbol $a$ to represent the polynomial $ax^0 = ax^0 + 0x + \cdots$ (such a
polynomial is known as a constant polynomial, and, in general, the term $a_0$ of
the polynomial $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ is known as the constant term);
the justification being the fact that the rings $\{R, \perp, \cdot\}$ and $\{\phi(R), \oplus, \odot\}$
(which is a subring of $\{R[x], \oplus, \odot\}$) are isomorphic. Any elements $a, b \in R$
may be viewed as $a = ax^0$, $b = bx^0$ in $R[x]$, so adding them with respect to the
operation $\oplus$ in $R[x]$, we have

\[ a \oplus b = ax^0 \oplus bx^0 = (a \perp b)x^0 = a \perp b. \]

This means that, on elements of $R$, the operations $\oplus$ and $\perp$ are the “same.”
Similarly, the operations $\odot$ and $\cdot$ are the same on elements of $R$, since

\[ a \odot b = ax^0 \odot bx^0 = (a \cdot b)x^0 = a \cdot b. \]

Therefore, the operations $\perp$ and $\cdot$ may be replaced by $\oplus$ and $\odot$, respectively,
whenever they appear, and everything remains compatible. In particular, the
ring $\{R, \perp, \cdot\}$ is now written as $\{R, \oplus, \odot\}$, and it is clearly a subring of
$\{R[x], \oplus, \odot\}$. The symbols $\perp$, $\cdot$, and $+$ have been dropped completely!

If one insists on having the notation conform exactly with the familiar
notation, it is only necessary to replace $\oplus$ by the symbol $+$ and $\odot$ by the
symbol $\cdot$; thereby concluding with the ring $\{R[x], +, \cdot\}$ and the subring
$\{R, +, \cdot\}$. In any event, the $-$ sign represents the usual additive inverse in a
ring; so, for example, the term $-(a_i x^i)$ represents $(-a_i)x^i$, and
$a_0 - a_1 x - a_2 x^2 + a_3 x^3$ is another way of writing $a_0 + (-a_1)x + (-a_2)x^2 + a_3 x^3$.

This completes our discussion of the notation in a polynomial ring.

Given any polynomial $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in R[x]$ with $a_n \ne 0$ we
define its degree [which is written as $\deg f(x)$] to be $n$. In other words, for a
polynomial $f(x) = \sum_{0}^{\infty} a_i x^i$, the degree is the highest power or exponent of $x$
which has a nonzero coefficient. (For example, the polynomial
$2 - 3x + x^2 - 6x^4 \in \mathbf{Z}[x]$ is of degree 4; however, if this same polynomial is viewed as a
polynomial in $\mathbf{Z}_3[x]$, then its degree is 2.) It is understood that a constant
polynomial, meaning one of form $f(x) = a$, has degree 0, provided $a \ne 0$.
This leaves the zero polynomial $f(x) = 0 = 0 + 0x + 0x^2 + \cdots$; we define its
degree to be $-\infty$, where it is taken for granted that $-\infty$ is less than any
integer, and $-\infty$ plus any integer equals $-\infty$.
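The degree convention, including the value assigned to the zero polynomial, can be sketched in a few lines. The representation (a coefficient list indexed by exponent) and the function name `degree` are assumptions made for this illustration; Python's floating-point negative infinity is used as a stand-in for the symbol of the text, since it is less than every integer and absorbs integer addition.

```python
def degree(coeffs, modulus=None):
    """Largest exponent with a nonzero coefficient, or -infinity for zero.

    When `modulus` is given, the coefficients are first reduced modulo it,
    so the same list can be read as a polynomial over Z or over Z_m.
    """
    if modulus is not None:
        coeffs = [c % modulus for c in coeffs]
    for i in range(len(coeffs) - 1, -1, -1):
        if coeffs[i] != 0:
            return i
    return float("-inf")


f = [2, -3, 1, 0, -6]        # 2 - 3x + x^2 - 6x^4

print(degree(f))             # 4   (viewed in Z[x])
print(degree(f, 3))          # 2   (viewed in Z_3[x], where -6 becomes 0)
print(degree([5]))           # 0   (nonzero constant)
print(degree([0, 0]))        # -inf (zero polynomial)
print(degree([0]) + 7)       # -inf (minus infinity plus an integer)
```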

3-3-5. Proposition. Suppose any two polynomials $f(x)$ and $g(x)$ in $R[x]$
are given; then

(i) $\deg(f(x) + g(x)) \le \max\{\deg f(x), \deg g(x)\}$.

Moreover, if $\deg f(x) \ne \deg g(x)$, then equality holds.

(ii) $\deg(f(x) \cdot g(x)) \le \deg f(x) + \deg g(x)$.

Moreover, if the leading coefficient (by which we mean the last nonzero
coefficient) of just one of these polynomials is not a zero-divisor, then
equality holds.

Proof: This is really a familiar result from high school, except for some
slight complications which may be caused by the nature of the ring $R$.

If either $f(x) = 0$ or $g(x) = 0$ or both, it is easy to see that both (i) and (ii)
hold. The details, which depend on our conventions about $-\infty$, may safely
be left to the reader.

Suppose, therefore, that $f(x) \ne 0$ and $g(x) \ne 0$; so we have
$f(x) = a_0 + a_1 x + \cdots + a_n x^n$ with $a_n \ne 0$ and $g(x) = b_0 + b_1 x + \cdots + b_m x^m$ with
$b_m \ne 0$. In particular, $\deg f(x) = n \ge 0$ and $\deg g(x) = m \ge 0$. If
$\deg f(x) \ne \deg g(x)$, there is nothing lost in assuming that $\deg f(x) < \deg g(x)$; thus $n < m$
and the leading term of $f(x) \pm g(x)$ is $\pm b_m x^m$, so

\[ \deg(f(x) \pm g(x)) = m = \max\{\deg f(x), \deg g(x)\} \]

which proves the second part of (i). If, however, $\deg f(x) = \deg g(x)$, then
$m = n$ and

\[ f(x) \pm g(x) = (a_0 \pm b_0) + (a_1 \pm b_1)x + \cdots + (a_m \pm b_m)x^m. \]

Unfortunately, it may well be that $a_m \pm b_m = 0$, so the only conclusion we can
draw is

\[ \deg(f(x) \pm g(x)) \le m = \max\{\deg f(x), \deg g(x)\}. \]

This completes the proof of (i).

As for (ii), surely $f(x) \cdot g(x)$ has no term after $a_n b_m x^{n+m}$ whose coefficient
is not equal to 0. Now, it may well be that $a_n b_m = 0$, so the only conclusion we
can draw is

\[ \deg(f(x) \cdot g(x)) \le n + m = \deg f(x) + \deg g(x). \]

However, $a_n b_m = 0$ cannot occur if either of the factors is not a zero-divisor,
so in this case $\deg(f(x) \cdot g(x)) = n + m$ and equality holds. This completes
the proof of (ii). ∎

To illustrate this result, consider $f(x) = 1 - x + 6x^3$, $g(x) = 2 + x + 18x^3$,
$h(x) = 1 + 12x^2$. When these are viewed as polynomials in $\mathbf{Z}[x]$, we have

\[ \deg f(x) = 3, \quad \deg g(x) = 3, \quad \deg h(x) = 2, \]

\[ \deg(f(x) + g(x)) = 3, \quad \deg(g(x) + h(x)) = 3, \quad \deg(f(x) + h(x)) = 3, \]

\[ \deg(f(x) \cdot g(x)) = 6, \quad \deg(g(x) \cdot h(x)) = 5, \quad \deg(f(x) \cdot h(x)) = 5. \]

On the other hand, these polynomials can also be viewed in $\mathbf{Z}_{24}[x]$. The degrees
of $f(x)$, $g(x)$, $h(x)$ do not change, but in $\mathbf{Z}_{24}[x]$ we have

\[
\begin{aligned}
f(x) + g(x) &= 3, \\
g(x) + h(x) &= 3 + x + 12x^2 + 18x^3, \\
f(x) + h(x) &= 2 - x + 12x^2 + 6x^3, \\
f(x) \cdot g(x) &= 2 - x - x^2 + 6x^3 - 12x^4 + 12x^6, \\
g(x) \cdot h(x) &= 2 + x + 6x^3, \\
f(x) \cdot h(x) &= 1 - x + 12x^2 - 6x^3,
\end{aligned}
\]

so that

\[ \deg(f(x) + g(x)) = 0, \quad \deg(g(x) + h(x)) = 3, \quad \deg(f(x) + h(x)) = 3, \]

\[ \deg(f(x) \cdot g(x)) = 6, \quad \deg(g(x) \cdot h(x)) = 3, \quad \deg(f(x) \cdot h(x)) = 3. \]
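The degrees listed in this illustration can be recomputed directly. The sketch below assumes coefficient lists indexed by exponent and an optional modulus argument; the function names are illustrative. Coefficients modulo 24 are reported as representatives in the range 0 to 23, so that -1 appears as 23 and -6 as 18.

```python
from itertools import zip_longest


def reduce(p, m=None):
    """Reduce coefficients modulo m (if given) and drop trailing zeros."""
    q = [c % m for c in p] if m else list(p)
    while q and q[-1] == 0:
        q.pop()
    return q


def add(f, g, m=None):
    return reduce([a + b for a, b in zip_longest(f, g, fillvalue=0)], m)


def mul(f, g, m=None):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for r, a in enumerate(f):
        for s, b in enumerate(g):
            out[r + s] += a * b
    return reduce(out, m)


def deg(p):
    """Degree of an already reduced coefficient list."""
    return len(p) - 1 if p else float("-inf")


f = [1, -1, 0, 6]    # 1 - x + 6x^3
g = [2, 1, 0, 18]    # 2 + x + 18x^3
h = [1, 0, 12]       # 1 + 12x^2

# Over Z the degree of a product is the sum of the degrees.
print(deg(mul(f, g)), deg(mul(g, h)), deg(mul(f, h)))              # 6 5 5

# Over Z_24 leading coefficients can cancel or multiply to zero.
print(add(f, g, 24))                                               # [3]
print(mul(g, h, 24))                                               # [2, 1, 0, 6]
print(deg(mul(f, g, 24)), deg(mul(g, h, 24)), deg(mul(f, h, 24)))  # 6 3 3
```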


3-3-6. Proposition. Suppose $R$ is a commutative ring with unity. Then
$R[x]$ is an integral domain $\iff$ $R$ is an integral domain.

Furthermore, when $R$ is an integral domain $D$, the units of $D[x]$ are
precisely the units of $D$.

Proof: Suppose $R$ is an integral domain and $f(x)$ is a zero-divisor in $R[x]$.
Thus, $f(x) \ne 0$, and there exists $g(x) \ne 0$ in $R[x]$ such that

\[ f(x) \cdot g(x) = 0. \]

Therefore, $\deg(f(x) \cdot g(x)) = \deg 0 = -\infty$. On the other hand, $\deg f(x) \ge 0$,
$\deg g(x) \ge 0$; and according to 3-3-5, $\deg(f(x) \cdot g(x)) = \deg f(x) + \deg g(x) \ge 0$,
since by the hypothesis on $R$ the leading coefficient of $f(x)$ is not a
zero-divisor. This is a contradiction; so there cannot be any zero-divisors $f(x)$ in
$R[x]$, and $R[x]$ is an integral domain. This proves $\Leftarrow$.

As for $\Rightarrow$; if $R[x]$ is an integral domain, then $R$ is a commutative ring with
unity which has no zero-divisors because it is contained in $R[x]$, so $R$ is an
integral domain.

Finally, suppose $D$ is an integral domain. If $a \in D$ is a unit, there exists
$b \in D$ such that $ab = 1$. But $a$, $b$, 1 are also elements of $D[x]$, and 1 is the unity
of $D[x]$ also, so $a$ is a unit of $D[x]$. Conversely, suppose $f(x) \in D[x]$ is a unit.
Then there exists $g(x) \in D[x]$ such that

\[ f(x) \cdot g(x) = 1. \]

Of course, $f(x) \ne 0$, $g(x) \ne 0$, so that $\deg f(x) \ge 0$, $\deg g(x) \ge 0$. Now,
$\deg(1) = 0$, and $\deg(f(x) \cdot g(x)) = \deg f(x) + \deg g(x)$ because the leading
coefficients of $f(x)$ and $g(x)$ are not equal to 0 and they come from the integral
domain $D$. [In general, the fact that the degree of a product of two polynomials
is equal to the sum of their degrees is familiar from high school. Its validity
there is a consequence of the fact that we always dealt with polynomials
whose coefficients came from $\mathbf{Z}$, $\mathbf{Q}$, $\mathbf{R}$, or $\mathbf{C}$, all of which are integral
domains. Only when the coefficients come from a ring which is not an integral
domain is it possible that $\deg(f(x) \cdot g(x)) \ne \deg f(x) + \deg g(x)$.] Therefore,

\[ \deg f(x) + \deg g(x) = 0, \]

so that both $f(x)$ and $g(x)$ have degree 0. Consequently, these polynomials
must be of form $f(x) = a$, $g(x) = b$ where $a$ and $b$ are nonzero elements of $D$.
In particular, $f(x)$ and $g(x)$ are elements of $D$ whose product is 1, so $f(x)$ is a
unit of $D$. This completes the proof that the units of $D[x]$ are the same as the
units of $D$. ∎

In virtue of this result, it is clear that $\mathbf{Z}_{12}[x]$ is not an integral domain
because $\mathbf{Z}_{12}$ is not an integral domain. On the other hand, $\mathbf{Z}[x]$ is an integral
domain (because $\mathbf{Z}$ is), and its only units are $\pm 1$. Furthermore, $\mathbf{Q}[x]$ is an
integral domain (because $\mathbf{Q}$ is an integral domain) and its units are the
nonzero elements of $\mathbf{Q}$ (because these are the units of $\mathbf{Q}$). More generally, because
any field is an integral domain whose units are all its nonzero elements, we
have:


3-3-7. Corollary. If $F$ is a field, then $F[x]$ is an integral domain whose set
of units is $F^*$, the set of all nonzero elements of $F$.
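The statement about units depends on the coefficient ring being an integral domain. As a small illustration that is not taken from the text, consider the integers modulo 4, which have the zero-divisor 2: there the nonconstant polynomial 1 + 2x is its own inverse. The helper `mul_mod` below is a hypothetical name for coefficientwise multiplication modulo m.

```python
def mul_mod(f, g, m):
    """Multiply two coefficient lists, reducing coefficients modulo m."""
    out = [0] * (len(f) + len(g) - 1)
    for r, a in enumerate(f):
        for s, b in enumerate(g):
            out[r + s] = (out[r + s] + a * b) % m
    while out and out[-1] == 0:   # drop trailing zero coefficients
        out.pop()
    return out


u = [1, 2]                # 1 + 2x
print(mul_mod(u, u, 4))   # [1], since 1 + 4x + 4x^2 reduces to 1 modulo 4
```

So a polynomial of degree 1 can be a unit when the coefficients come from a ring with zero-divisors, which is exactly the situation the degree argument in the proof rules out for an integral domain.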


So far, we have discussed polynomials; now let us turn to polynomial 
functions. 

3-3-8. Remark. Suppose we are given a polynomial

\[ f(x) = a_0 + a_1 x + \cdots + a_n x^n = \sum_{0}^{\infty} a_i x^i \]

in $R[x]$. For any $c \in R$ consider the expression

\[ a_0 + a_1 c + \cdots + a_n c^n. \]

This involves multiplying and adding elements of $R$; so it represents an
element of $R$, which we write as $f(c)$. For obvious reasons, this procedure may be
called substitution of $c$ for $x$. Note that only elements of $R$ may be substituted
for $x$.

There is nothing strange about substitution; it is part of our everyday
experience. For example, if $f(x) = 2 - x + x^3 \in \mathbf{Z}[x]$, then $f(1) = 2$, $f(2) = 8$,
$f(0) = 2$, $f(-1) = 2$, and so on.

Now, suppose that, in addition to $f(x) = \sum_{0}^{\infty} a_i x^i \in R[x]$, we have the
polynomial

\[ g(x) = b_0 + b_1 x + \cdots + b_m x^m = \sum_{0}^{\infty} b_i x^i \]

in $R[x]$. Let us consider their sum and their product

\[ s(x) = f(x) + g(x), \qquad t(x) = f(x) \cdot g(x). \]

Expressing these explicitly, we have

\[
\begin{aligned}
s(x) &= (a_0 + b_0) + (a_1 + b_1)x + \cdots + (a_i + b_i)x^i + \cdots \\
&= \sum_{0}^{\infty} (a_i + b_i) x^i
\end{aligned}
\]

and

\[
\begin{aligned}
t(x) &= a_0 b_0 + (a_1 b_0 + a_0 b_1)x + \cdots + (a_i b_0 + a_{i-1} b_1 + \cdots + a_0 b_i)x^i + \cdots \\
&= \sum_{i=0}^{\infty} \Bigl(\sum_{r+s=i} a_r b_s\Bigr) x^i.
\end{aligned}
\]

What happens if $c$ is substituted for $x$ in $s(x)$ and $t(x)$?

If we consider the concrete situation, $f(x) = 2 - x + x^3 \in \mathbf{Z}[x]$,
$g(x) = 1 + x + x^2 \in \mathbf{Z}[x]$, then

\[
\begin{aligned}
s(x) &= f(x) + g(x) = 3 + x^2 + x^3, \\
t(x) &= f(x) \cdot g(x) = 2 + x + x^2 + x^4 + x^5
\end{aligned}
\]

and, for example, $f(1) = 2$, $g(1) = 3$, $s(1) = 5$, $t(1) = 6$, so $s(1) = f(1) + g(1)$
and $t(1) = f(1) \cdot g(1)$; $f(0) = 2$, $g(0) = 1$, $s(0) = 3$, $t(0) = 2$, so $s(0) = f(0) + g(0)$
and $t(0) = f(0) \cdot g(0)$; $f(-1) = 2$, $g(-1) = 1$, $s(-1) = 3$, $t(-1) = 2$,
so $s(-1) = f(-1) + g(-1)$ and $t(-1) = f(-1) \cdot g(-1)$. More generally, for
any $c \in \mathbf{Z}$, we have $f(c) = 2 - c + c^3$, $g(c) = 1 + c + c^2$, $s(c) = 3 + c^2 + c^3$,
$t(c) = 2 + c + c^2 + c^4 + c^5$, so as expected

\[ s(c) = f(c) + g(c) \quad \text{and} \quad t(c) = f(c) \cdot g(c). \]

Returning to the general situation, it is not hard to see that the same rules
always hold. In more detail, for any $c \in R$, a trivial computation in $R$ gives

\[
\begin{aligned}
f(c) + g(c) &= (a_0 + a_1 c + a_2 c^2 + \cdots + a_n c^n) + (b_0 + b_1 c + b_2 c^2 + \cdots + b_m c^m) \\
&= (a_0 + b_0) + (a_1 + b_1)c + \cdots + (a_i + b_i)c^i + \cdots \\
&= s(c).
\end{aligned}
\]

The compact way to write this proof is

\[
\begin{aligned}
f(c) + g(c) &= \Bigl(\sum_{0}^{\infty} a_i c^i\Bigr) + \Bigl(\sum_{0}^{\infty} b_i c^i\Bigr) \\
&= \sum_{0}^{\infty} (a_i + b_i) c^i \\
&= s(c)
\end{aligned}
\]

and this is valid because $\sum_{0}^{\infty}$ is really a finite sum whenever it appears.

In the same spirit, we have in the general case

\[
\begin{aligned}
f(c) \cdot g(c) &= \Bigl(\sum_{0}^{\infty} a_i c^i\Bigr) \cdot \Bigl(\sum_{0}^{\infty} b_i c^i\Bigr) \\
&= \sum_{i=0}^{\infty} \Bigl(\sum_{r+s=i} a_r b_s\Bigr) c^i \\
&= t(c).
\end{aligned}
\]

The reader who feels uncomfortable with infinite summation symbols may
use the explicit finite expressions for $f(c)$, $g(c)$, $t(c)$ to verify that
$t(c) = f(c) \cdot g(c)$. The key to the matter is that computations with “polynomial
expressions in $c$” go just like the same computations with the corresponding
polynomials in $x$.
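The compatibility of substitution with sums and products is easy to spot-check numerically for the concrete polynomials used above. The sketch assumes integer coefficient lists indexed by exponent and evaluates them by Horner's rule, a standard evaluation scheme that is not part of the text's argument; the name `evaluate` is illustrative.

```python
def evaluate(coeffs, c):
    """Substitute c for x, using Horner's rule: (...(a_n c + a_{n-1}) c + ...) + a_0."""
    acc = 0
    for a in reversed(coeffs):
        acc = acc * c + a
    return acc


f = [2, -1, 0, 1]         # 2 - x + x^3
g = [1, 1, 1]             # 1 + x + x^2
s = [3, 0, 1, 1]          # f + g = 3 + x^2 + x^3
t = [2, 1, 1, 0, 1, 1]    # f * g = 2 + x + x^2 + x^4 + x^5

# s(c) = f(c) + g(c) and t(c) = f(c) * g(c) on a range of integers c.
for c in range(-5, 6):
    assert evaluate(s, c) == evaluate(f, c) + evaluate(g, c)
    assert evaluate(t, c) == evaluate(f, c) * evaluate(g, c)

print([evaluate(p, 1) for p in (f, g, s, t)])    # [2, 3, 5, 6]
print([evaluate(p, -1) for p in (f, g, s, t)])   # [2, 1, 3, 2]
```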

3-3-9. Remark. If $f(x) \in R[x]$ is given, then for any $c \in R$ we can compute
$f(c) \in R$. Thus, we have a function or mapping

\[ c \to f(c), \qquad c \in R \]

of $R \to R$ which is determined by $f(x)$. For want of a better notation, let us
denote this mapping of $R \to R$, which carries $c$ to $f(c)$ for every $c \in R$, by $\bar{f}$. In
other words, we have defined $\bar{f} \in \mathrm{Map}(R, R)$ by the rule

\[ \bar{f}(c) = f(c), \qquad c \in R. \]

Such a function $\bar{f}: R \to R$ which arises from some polynomial $f(x) \in R[x]$ is
known as a polynomial function over $R$. Clearly, polynomials and polynomial
functions are different things.

The notion of a polynomial function and the notation for it are probably
confusing (after all, a new and somewhat strange level of abstraction has been
introduced), so let us do a concrete example. Suppose $R = \mathbf{Z}_4$ and we
consider the polynomial $f(x) = 2 - 3x + x^3 \in \mathbf{Z}_4[x]$. What is $\bar{f}$? It is the mapping
of $\mathbf{Z}_4 \to \mathbf{Z}_4$ [that is, the element of $\mathrm{Map}(\mathbf{Z}_4, \mathbf{Z}_4)$] which maps
$c \to f(c) = 2 - 3c + c^3$ for every $c \in \mathbf{Z}_4$; in other words,

\[ \bar{f}(c) = 2 - 3c + c^3, \qquad c \in \mathbf{Z}_4. \]

Of course, $\bar{f}$, like any other element of $\mathrm{Map}(\mathbf{Z}_4, \mathbf{Z}_4)$, is specified by what it
does to every element of $\mathbf{Z}_4$; thus, letting 0, 1, 2, 3 represent the four elements
$\lfloor 0 \rfloor_4, \lfloor 1 \rfloor_4, \lfloor 2 \rfloor_4, \lfloor 3 \rfloor_4$ of $\mathbf{Z}_4$, the mapping $\bar{f} \in \mathrm{Map}(\mathbf{Z}_4, \mathbf{Z}_4)$ is given by

\[ \bar{f}(0) = 2, \quad \bar{f}(1) = 0, \quad \bar{f}(2) = 0, \quad \bar{f}(3) = 0. \]
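The passage from a polynomial to its polynomial function can be imitated by turning a coefficient list into a callable. This is a sketch under the assumption that the ring is the integers modulo m, with residues represented by 0, ..., m - 1; the name `poly_function` is hypothetical.

```python
def poly_function(coeffs, m):
    """Return the mapping c -> f(c) on Z_m determined by the coefficient list."""
    def f_bar(c):
        # pow(c, i, m) is c^i reduced modulo m; pow(0, 0, m) is 1 for m > 1.
        return sum(a * pow(c, i, m) for i, a in enumerate(coeffs)) % m
    return f_bar


f_bar = poly_function([2, -3, 0, 1], 4)   # f(x) = 2 - 3x + x^3 in Z_4[x]

print([f_bar(c) for c in range(4)])       # [2, 0, 0, 0]
```

The list of four values is all there is to the function, whereas the polynomial itself is the coefficient list; the two objects are of different kinds.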

3-3-10. Remark. It is natural to ask whether polynomials and polynomial 
functions are not really different versions of the same reality. Is it true that 
distinct (that is, different) polynomials f(x) and g(x) always lead to distinct 
polynomial functions/and g ? Or, can it be that f(x) # g(x) but yet/ = g ? In 
our past experience, we invariably considered polynomials and polynomial 
functions as the same thing; in fact, one never distinguished between them in 
any way. However, the following simple example shows why this assumption 
is unwarranted in general. (In Section 3-5, we shall see why this assumption 
can be justified in the situations dealt with in high school.) 

Let $R = \mathbf{Z}_p$, where $p$ is prime, and consider the polynomials

$$f(x) = x^p, \qquad g(x) = x$$

in $\mathbf{Z}_p[x]$. Obviously, these are different polynomials: $f(x) \neq g(x)$. On the
other hand, we assert that $\bar{f} = \bar{g}$. To prove that the elements $\bar{f}, \bar{g} \in \mathrm{Map}(\mathbf{Z}_p, \mathbf{Z}_p)$
are the same, we must show that $\bar{f}(\alpha) = \bar{g}(\alpha)$ for every $\alpha \in \mathbf{Z}_p$. But for $\alpha \in \mathbf{Z}_p$,
we have

$$\bar{f}(\alpha) = f(\alpha) = \alpha^p, \qquad \bar{g}(\alpha) = g(\alpha) = \alpha$$

and these are equal in virtue of Fermat's theorem, 2-4-9. Thus, distinct
polynomials can indeed lead to the same polynomial function.
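For small primes the assertion can be checked by brute force. The sketch below is an illustration only; `eval_poly` and `same_function` are hypothetical helper names, and coefficient lists are assumed to be ordered from the constant term upward.

```python
def eval_poly(coeffs, c, m):
    """Evaluate a polynomial at c in Z_m by Horner's scheme.

    coeffs[i] is the coefficient of x**i (an assumed convention).
    """
    acc = 0
    for a in reversed(coeffs):
        acc = (acc * c + a) % m
    return acc


def same_function(f, g, m):
    """True when f and g induce the same mapping Z_m -> Z_m."""
    return all(eval_poly(f, c, m) == eval_poly(g, c, m) for c in range(m))


for p in (2, 3, 5, 7, 11):
    f = [0] * p + [1]   # x^p
    g = [0, 1]          # x
    assert f != g                   # distinct as polynomials
    assert same_function(f, g, p)   # equal as polynomial functions

# By contrast, x^2 and x do not agree on Z_3 (they differ at c = 2).
print(same_function([0, 0, 1], [0, 1], 3))
```

A finite check of this kind is no substitute for Fermat's theorem, of course; it merely confirms the statement for the primes tried.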

There is more to be said about the connection between polynomials and
polynomial functions. To any polynomial $f(x) \in R[x]$ there is associated the
mapping $\bar{f}: R \to R$ [that is, the element $\bar{f} \in \mathrm{Map}(R, R)$] that is given by
$\bar{f}(c) = f(c)$ for all $c \in R$. Thus, we can define a mapping

$$A: R[x] \to \mathrm{Map}(R, R)$$

by putting

$$A(f(x)) = \bar{f}, \qquad f(x) \in R[x].$$

In other words, $A$ is the mapping which carries each polynomial $f(x)$ to its
associated polynomial function $\bar{f}$. Now, in 2-2-12 we saw that, for any set
$X$, $\mathrm{Map}(X, R)$ becomes a ring when the operations are defined in an appropriate
manner. In particular, in the case $X = R$, $\mathrm{Map}(R, R)$ becomes a ring in
which the operations take the following form: If $\phi$ and $\psi$ are elements of
$\mathrm{Map}(R, R)$, then $\phi + \psi$ and $\phi \cdot \psi$ are the elements of $\mathrm{Map}(R, R)$ defined by

$$(\phi + \psi)(c) = \phi(c) + \psi(c), \qquad (\phi \cdot \psi)(c) = \phi(c) \cdot \psi(c), \qquad c \in R.$$


3-3-11. Proposition. The mapping

$$A: R[x] \to \mathrm{Map}(R, R)$$

given by

$$A(f(x)) = \bar{f}, \qquad f(x) \in R[x]$$

is a homomorphism of rings.


Proof: Given any $f(x), g(x) \in R[x]$, let us write $s(x) = f(x) + g(x)$ and
$t(x) = f(x) \cdot g(x)$. We need to show that

$$A(f(x) + g(x)) = A(f(x)) + A(g(x)),$$

$$A(f(x)g(x)) = A(f(x)) \cdot A(g(x)).$$

In connection with the first of these, we observe that

$$A(f(x) + g(x)) = A(s(x)) = \bar{s} \qquad \text{and} \qquad A(f(x)) + A(g(x)) = \bar{f} + \bar{g}.$$

It remains to verify that $\bar{s} = \bar{f} + \bar{g}$, and because we are dealing with elements
of $\mathrm{Map}(R, R)$ this requires verification that $\bar{s}(c) = (\bar{f} + \bar{g})(c)$ for every $c \in R$.
But this is not hard.

$$
\begin{aligned}
\bar{s}(c) &= s(c) && \text{(by definition of } \bar{s}\text{)}, \\
&= f(c) + g(c) && \text{(by 3-3-8)}, \\
&= \bar{f}(c) + \bar{g}(c) && \text{(by definition of } \bar{f} \text{ and } \bar{g}\text{)}, \\
&= (\bar{f} + \bar{g})(c) && \text{(by definition of } + \text{ in } \mathrm{Map}(R, R)\text{)}.
\end{aligned}
$$

In similar fashion, $A$ preserves multiplication; in fact, $A(f(x)g(x)) = A(t(x)) = \bar{t}$,
$A(f(x)) \cdot A(g(x)) = \bar{f} \cdot \bar{g}$, and $\bar{t} = \bar{f} \cdot \bar{g}$ because, for every $c \in R$,
we have $\bar{t}(c) = t(c) = f(c) \cdot g(c) = \bar{f}(c) \cdot \bar{g}(c) = (\bar{f} \cdot \bar{g})(c)$. This completes the
proof. ∎

There are two kinds of problems about polynomials around which our
concern will center for the rest of this chapter: solving them and factoring
them. We conclude this section with some preliminary comments about these
questions.

3-3-12. Definition. The element $c \in R$ is said to be a root of the polynomial
$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in R[x]$ when $f(c) = 0$.

Other terminologies used for this notion are: $c$ is a solution of the equation
$f(x) = a_0 + a_1 x + \cdots + a_n x^n = 0$; $c$ is a place where the polynomial $f(x)$ (or
the polynomial function $\bar{f}$) takes the value 0; or simply, $c$ is a zero of the
polynomial $f(x)$ (or of the polynomial function $\bar{f}$). One may also take a
"geometric" view. In fact, the set

$$G(f) = \{(r, f(r)) \mid r \in R\}$$

is obviously a set of points in the direct product $R \times R$. In analogy with the
familiar situation in $\mathbf{R} \times \mathbf{R}$ (which is just another way of denoting the Euclidean
plane), we may call $G(f)$ the graph of $f(x)$, and then a root of $f(x)$ is a
value of the unknown (that is, indeterminate) $x$ where the graph crosses the
horizontal axis.

It is important to emphasize the ring $R$ from which the coefficients of
$f(x)$ come, for according to our definition, the roots of $f(x)$ can only come
from $R$. For example, consider the polynomial

$$f(x) = 3x^3 - 4x^2 + 2x + 3.$$

If we view $f(x) \in \mathbf{Z}[x]$, then $f(3) = 54$, so 3 is not a root of $f(x)$. On the other
hand, if we view $f(x) \in \mathbf{Z}_6[x]$, then $f(3) = 0$ and 3 is a root of $f(x)$. More
precisely, here $f(x)$ is really $\lfloor 3 \rfloor_6 x^3 - \lfloor 4 \rfloor_6 x^2 + \lfloor 2 \rfloor_6 x + \lfloor 3 \rfloor_6$ (and we are
keeping the notation simple by dropping $\lfloor\ \rfloor_6$ throughout) so that
$f(\lfloor 3 \rfloor_6) = \lfloor 54 \rfloor_6 = \lfloor 0 \rfloor_6$, and this is written as $f(3) = 0$.
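The dependence of the roots on the coefficient ring is easy to see numerically. Here is a small sketch under our own conventions (the name `eval_poly` is hypothetical, and coefficients are listed from the constant term upward); it evaluates the polynomial over $\mathbf{Z}$ and then searches all of $\mathbf{Z}_6$ for roots.

```python
def eval_poly(coeffs, c):
    """Evaluate a polynomial with integer coefficients at the integer c.

    coeffs[i] is the coefficient of x**i (an assumed convention).
    """
    return sum(a * c**i for i, a in enumerate(coeffs))


f = [3, 2, -4, 3]                 # 3 + 2x - 4x^2 + 3x^3

value_over_Z = eval_poly(f, 3)    # f(3) computed in Z
roots_mod_6 = [c for c in range(6) if eval_poly(f, c) % 6 == 0]

print(value_over_Z, roots_mod_6)
```

The exhaustive search is feasible only because $\mathbf{Z}_6$ is finite; it also reports every root in $\mathbf{Z}_6$, not just the one singled out in the discussion.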

When $R$ is a commutative ring with unity, so is $R[x]$, and therefore (as
indicated in 3-2-1) it is meaningful to talk about divisibility or factorization in
$R[x]$. The connection between a root of the polynomial $f(x)$ and its factorization
is given by:


3-3-13. Factor Theorem. Given $f(x) \in R[x]$; then $c \in R$ is a root of $f(x)$ $\iff$
$(x - c)$ divides $f(x)$.


Proof: Let us prove $\Leftarrow$ first. If $(x - c)$ divides $f(x)$, then there exists
$g(x) \in R[x]$ such that

$$f(x) = (x - c)g(x).$$

By the properties of substitution, 3-3-8, we have $f(c) = (c - c)g(c) = 0$, so
$c$ is indeed a root of $f(x)$.

As for the implication $\Rightarrow$, let us write

$$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n.$$

Since $c$ is a root of $f(x)$, we have

$$f(c) = a_0 + a_1 c + a_2 c^2 + \cdots + a_n c^n = 0.$$

Because everything in sight belongs to $R[x]$, we may then write

$$f(x) = f(x) - f(c) = a_1(x - c) + a_2(x^2 - c^2) + \cdots + a_n(x^n - c^n). \qquad (*)$$

Now, in $R[x]$, we clearly have $x^2 - c^2 = (x - c)(x + c)$, $x^3 - c^3 = (x - c)(x^2 + cx + c^2)$,
and for arbitrary $i \geq 2$

$$x^i - c^i = (x - c)(x^{i-1} + cx^{i-2} + \cdots + c^{i-2}x + c^{i-1}).$$

Thus, $(x - c)$ is a factor of each term on the right side of $(*)$, so (in virtue of
the generalized distributive law in $R[x]$) $(x - c)$ is a factor of $f(x)$; in
other words, $(x - c)$ divides $f(x)$. This completes the proof. ∎
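The Factor Theorem can be watched in action by actually dividing by $(x - c)$. The sketch below uses Horner's scheme (synthetic division), which is a standard technique but not the argument used in the proof above; the function name `divide_by_linear` and the lowest-degree-first coefficient order are our own assumptions.

```python
def divide_by_linear(coeffs, c, m):
    """Divide f(x) by (x - c) in Z_m[x] using Horner's scheme.

    coeffs[i] is the coefficient of x**i (an assumed convention).
    Returns (quotient, remainder); the remainder equals f(c) mod m.
    """
    acc = 0
    partial = []
    for a in reversed(coeffs):      # process the highest degree first
        acc = (acc * c + a) % m
        partial.append(acc)
    remainder = partial.pop()       # the last Horner value is f(c)
    quotient = partial[::-1]        # back to lowest-degree-first order
    return quotient, remainder


# x^2 - 1 in Z_15[x]: dividing by (x - 4) leaves remainder 0,
# while dividing by (x - 2) leaves the nonzero remainder f(2) = 3.
print(divide_by_linear([-1, 0, 1], 4, 15))
print(divide_by_linear([-1, 0, 1], 2, 15))
```

A zero remainder corresponds to $c$ being a root, and the returned quotient is then a polynomial $g(x)$ with $f(x) = (x - c)g(x)$.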

3-3-14. Example. Consider the polynomial $f(x) = x^2 - 1 \in \mathbf{Z}_{15}[x]$. Its
roots, if any, must come from $\mathbf{Z}_{15}$, so the only possibilities are 0, 1, 2, ..., 14,
where according to our convention each such integer $a$ represents the element
$\lfloor a \rfloor_{15} \in \mathbf{Z}_{15}$. When each of these 15 possibilities is tested, we find that 1, 4, 11,
14 (meaning $\lfloor 1 \rfloor_{15}$, $\lfloor 4 \rfloor_{15}$, $\lfloor 11 \rfloor_{15}$, $\lfloor 14 \rfloor_{15}$) are the roots of $f(x)$. This is surprising
perhaps, since we normally expect a polynomial of degree 2 to have
at most 2 roots. However, having seen in Section 3-1 that a linear equation
$\alpha x = \beta$ in $\mathbf{Z}_m$ may have more than one root, our surprise here should not be
very great.

Since 1, 4, 11, 14 are roots, 3-3-13 tells us that $(x - 1)$, $(x - 4)$, $(x - 11)$,
$(x - 14)$ all divide $x^2 - 1$ in $\mathbf{Z}_{15}[x]$. In fact, we have in $\mathbf{Z}_{15}[x]$

$$x^2 - 1 = (x - 1)(x - 14) = (x - 4)(x - 11).$$

Here we have two distinct factorizations, neither of which can be factored
further. Thus, factorization in $\mathbf{Z}_{15}[x]$ need not be unique.
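Both the list of roots and the two factorizations can be confirmed by direct computation. This is a minimal sketch; the helper `poly_mul` and the lowest-degree-first coefficient lists are assumptions made for the illustration.

```python
def poly_mul(f, g, m):
    """Multiply two polynomials in Z_m[x].

    Coefficient lists are ordered from the constant term upward
    (an assumed convention).
    """
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % m
    return out


m = 15
roots = [c for c in range(m) if (c * c - 1) % m == 0]
target = [(-1) % m, 0, 1]                      # x^2 - 1 reduced mod 15

first = poly_mul([-1, 1], [-14, 1], m)         # (x - 1)(x - 14)
second = poly_mul([-4, 1], [-11, 1], m)        # (x - 4)(x - 11)

print(roots, first == target, second == target)
```

The search tests all 15 candidates, exactly as described in the example.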

Incidentally, in $\mathbf{Z}_{15}$, $14 = -1$ and $11 = -4$, so we also have the factorizations

$$x^2 - 1 = (x - 1)(x + 1) = (x - 4)(x + 4),$$

but these factorizations are really the same as the preceding ones.

3-3-15 / PROBLEMS

1. (i) Find the sum, difference, and product of the polynomials
$f(x) = 3 + 2x - x^2 - 3x^3$ and $g(x) = 3x^2 + 2x^3 + x - 1$ in $\mathbf{Z}[x]$.

(ii) Perform the same operations when f(x) and g(x) are viewed in 
(a) ZJx], (b) Z 5 [x], (c) ZJx], (d) ZJx], (e) Z n [x]. 

2. With the usual understanding that R is a commutative ring with unity, 
provide the missing details in the proof (see 3-3-3) that R[x] is a commuta¬ 
tive ring with unity. 

3. In connection with 3-3-4, explain carefully why 1 = lx° is the unity of 
R[x]. 

4. Suppose $c \in R$ and $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in R[x]$. How do we know
that

$$cf(x) = (ca_0) + (ca_1)x + \cdots + (ca_n)x^n?$$

5. In $R[x]$, what can be said about $xa$ for $a \in R$? More generally, given
$a, b \in R$ what is the meaning, if any, of $ax^i bx^j$?

6. When R is a commutative ring with unity, which of the following are 
subrings of R[x] ? 

(i) all polynomials with constant term equal to 0, 

(ii) all polynomials with constant term equal to 1, 

(iii) all polynomials of degree less than or equal to 7, 

(iv) all polynomials of degree equal to 7, 

(v) all polynomials in which the even powers of x all appear with 
coefficient 0 (x° is considered to be an even power), 

(vi) all polynomials in which all the odd powers of x appear with 
coefficient 0, 

(vii) all polynomials for which the coefficients of both x and x 2 are 0. 

7. Which of the following belong to $\mathbf{Z}[x]$?

(i) $1/x^2$, (ii) $x^{-1}$, (iii) $\sqrt{x}$.

8. Is it true that in any polynomial ring $R[x]$ we have

$$(x - 1)(x^{n-1} + x^{n-2} + \cdots + x + 1) = x^n - 1?$$

9. How many polynomials of degree 4 are there in $\mathbf{Z}_3[x]$? How many of
degree 5? How many of degree less than or equal to 5?

10. For integers $m \geq 0$, $n \geq 2$, how many polynomials are there in $\mathbf{Z}_n[x]$ of
degree exactly $m$? How many of degree less than or equal to $m$?

11. Since $\mathbf{Z}_6$ is not an integral domain, we know that $\mathbf{Z}_6[x]$ is not an integral
domain. Which of the following elements of $\mathbf{Z}_6[x]$ is a zero-divisor?
Justify your answers.

(i) $2x$, (ii) $x + 5$, (iii) $x + 2$, (iv) $2x + 4$,

00 5x — 1, (z?z) lx -F 3, (vii) 3x — 2. 

12. Can you find polynomials $f(x)$ and $g(x)$ in $\mathbf{Z}_{12}[x]$ such that $\deg f(x) =
\deg g(x) = 5$, $\deg(f(x) + g(x)) = 3$, and $\deg(f(x)g(x)) = 8$?

13. Prove: If $x$ and $y$ are indeterminates over $R$, then the rings $R[x]$ and $R[y]$
are isomorphic. Even more, if both $R$ and $R'$ are commutative rings with
unity which are isomorphic, $x$ is an indeterminate over $R$, and $y$ is an
indeterminate over $R'$, then the rings $R[x]$ and $R'[y]$ are isomorphic.

14. (i) Show that the polynomials $f_1(x) = x$, $f_2(x) = x^3$, $f_3(x) = x^6 + 2x^2 + x$,
$f_4(x) = x^9 + 8x^3 + x$ in $\mathbf{Z}_3[x]$ all determine the same polynomial
function.

(ii) What happens if they are viewed as polynomials in $\mathbf{Z}_5[x]$?

15. (i) Consider the polynomial $f(x) = x^3 - x + 2 \in \mathbf{Z}_3[x]$. Describe the
polynomial function $\bar{f}: \mathbf{Z}_3 \to \mathbf{Z}_3$ that it determines concretely, that
is, by exhibiting $\bar{f}(0)$, $\bar{f}(1)$, $\bar{f}(2)$. Find two other polynomials in $\mathbf{Z}_3[x]$
which determine the same polynomial function.

(ii) Answer the same questions as above when Z 3 is replaced by 
(a) Z 5 , (b) Z 7 , (c) Z n . 

16. Can you find a polynomial $f(x) \in \mathbf{Z}_3[x]$ whose polynomial function
$\bar{f} \in \mathrm{Map}(\mathbf{Z}_3, \mathbf{Z}_3)$ is given by

$$\bar{f}(0) = 1, \qquad \bar{f}(1) = 2, \qquad \bar{f}(2) = 1?$$

17. (z) Show that Map( Z 3 , Z 3 ) has 27 elements, as does the set Z 3 [x] of all 

polynomials in Z 3 [x] of degree less than or equal to 2. 

(ii) Consider the mapping $A: f(x) \to \bar{f}$ of

Z\[x] -> Map( Z 3 , Z 3 ). 

Is it one-to-one and onto ? 

18. Find the roots of/(x) = 3 + 2x — x 2 — 3x 3 when it is viewed as a poly¬ 
nomial over 

(0 Z 4 , (//) Z 5 , (iiz) Z 6 , (iv) Z 7 , (v) Z n . 

Do the same for g(x) = 3x 2 + 2x 3 + x — 1. 

19. Does (x - 3) divide x 4 + x 3 + x + 4 when they are viewed as poly¬ 
nomials over 

0) Z, 00 Q> 0*0 Z 3 , (iv) Z 4 , 

(v) Z 5 , (vi) Z 6 , (vii) Z 7 , (i?z # zz) Z n ? 





20. Fix any $c \in R$ where, as usual, $R$ is a commutative ring with unity. Define
a function

$$E_c: R[x] \to R$$

by putting

$$E_c(f(x)) = f(c), \qquad f(x) \in R[x].$$

Show that $E_c$, which is called the evaluation function at $c$, is a surjective
homomorphism. What is the kernel of $E_c$? When is $E_c = E_{c'}$?

21. (i) Suppose $D = P \cup \{0\} \cup -P$ is an ordered domain. Show that $D[x]$
becomes an ordered domain when we take as the set of positive elements
those polynomials whose leading coefficient belongs to $P$.

(ii) In this way, $\mathbf{Z}[x]$ becomes an ordered domain which contains a
least positive element, namely, 1. However, $\mathbf{Z}[x]$ is not well
ordered.

(iii) Show that $D[x]$ also becomes an ordered domain when an element
$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in D[x]$ is defined to be positive when
the first nonzero coefficient (starting from the left) is in $P$ (that is, is
positive). Thus, in virtue of part (i), the same domain $D[x]$ can be
made into an ordered domain in more than one way (that is, with
different sets of positive elements).

22. Show that $\mathbf{Z}_3[2x] = \mathbf{Z}_3[x]$; in other words, every polynomial in $2x$
(really, $\lfloor 2 \rfloor_3 x$, of course) over $\mathbf{Z}_3$ can be expressed as a polynomial in
$x$ over $\mathbf{Z}_3$, and conversely.

23. (i) Suppose $R$ is a commutative ring with unity and $c$ is a unit of $R$; then
$R[x] = R[cx]$.

(ii) If, in addition, $d$ is any element of $R$, then $R[x] = R[cx + d]$.

(iii) Is $R[x^2]$ equal to $R[x]$?

24. If $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n \in R[x]$, let us call

$$f'(x) = a_1 + 2a_2 x + 3a_3 x^2 + \cdots + na_n x^{n-1} \in R[x]$$

the derivative of $f(x)$. In compact notation

$$f(x) = \sum_{i=0}^{n} a_i x^i, \qquad f'(x) = \sum_{i=1}^{n} i a_i x^{i-1}.$$

Prove the usual rules for derivatives, namely,

(i) $(f(x) + g(x))' = f'(x) + g'(x)$,

(ii) $(cf(x))' = cf'(x)$,

(iii) $(f(x)g(x))' = f(x)g'(x) + f'(x)g(x)$,

(iv) $[(f(x))^n]' = n(f(x))^{n-1} f'(x)$.

25. If $R$ is a commutative ring with unity, let $R[[x]]$ denote the set of all
"formal power series" in $x$ over $R$. In other words, the elements of
$R[[x]]$ are of form

$$a_0 + a_1 x + a_2 x^2 + \cdots = \sum_{i=0}^{\infty} a_i x^i$$

where there are no restrictions on the elements $a_i$ of $R$. Defining addition
and multiplication in $R[[x]]$ just like for ordinary polynomials, show that
$R[[x]]$ becomes a commutative ring with unity which contains $R[x]$ as a
subring. Which are the units of $R[[x]]$? If $R$ is an integral domain $D$, show
that $D[[x]]$ is an integral domain.

3-4. Factorization in F[x] 

Consider a field $F$ and the polynomial ring $F[x]$. According to 3-3-7,
$F[x]$ is an integral domain whose set of units is $F^*$, the set of all nonzero
elements of $F$. In other words, a polynomial $u(x) \in F[x]$ is a unit $\iff$ $u(x)$ is a
constant polynomial of form $u(x) = u$ where $u \in F^*$. [Since $\deg(0) = -\infty$, we
may also say that $u(x)$ is a unit $\iff$ $\deg(u(x)) = 0$.] Our main objective in this
section is to study questions of divisibility and factorization in $F[x]$. We shall
see that as far as these questions are concerned $F[x]$ behaves very much like
$\mathbf{Z}$, except that the units complicate matters somewhat.

Let us examine the role played by units in questions of divisibility. As an
illustration, consider the polynomials

$$f(x) = (3 - \sqrt{2})x^2 + (1 - \sqrt{2})x + (1 + \sqrt{2})$$

and

$$g(x) = (1 + 2\sqrt{2})x^2 - x + (3 + 2\sqrt{2})$$

in $\mathbf{R}[x]$. It is easy to see that, although these polynomials appear to be unrelated,
they actually divide each other. In fact, if we take $u(x) = u = 1 + \sqrt{2}$
and $v(x) = v = -1 + \sqrt{2}$, then $g(x) = f(x)u$ [which says that $f(x) \mid g(x)$] and
$f(x) = g(x)v$ [which says that $g(x) \mid f(x)$]. Of course, $u = 1 + \sqrt{2}$ and
$v = -1 + \sqrt{2}$ are nonzero elements of $\mathbf{R}$, so they are units of $\mathbf{R}[x]$; even more, $u$
and $v$ are inverses of each other: $uv = (1 + \sqrt{2})(-1 + \sqrt{2}) = 1$.
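The claims $g(x) = f(x)u$, $f(x) = g(x)v$, and $uv = 1$ can be verified exactly, without floating-point arithmetic, by computing with numbers of the form $a + b\sqrt{2}$ stored as integer pairs. The representation and the names `mul` and `scale` are our own choices for this sketch.

```python
def mul(s, t):
    """Multiply a + b*sqrt(2) by c + d*sqrt(2), each stored as a pair of integers."""
    a, b = s
    c, d = t
    return (a * c + 2 * b * d, a * d + b * c)


def scale(poly, unit):
    """Multiply every coefficient of a polynomial by the given constant."""
    return [mul(coef, unit) for coef in poly]


# Coefficients from the constant term upward; (a, b) stands for a + b*sqrt(2).
f = [(1, 1), (1, -1), (3, -1)]    # (1 + r2) + (1 - r2)x + (3 - r2)x^2
g = [(3, 2), (-1, 0), (1, 2)]     # (3 + 2 r2) - x + (1 + 2 r2)x^2
u = (1, 1)                        # 1 + sqrt(2)
v = (-1, 1)                       # -1 + sqrt(2)

print(scale(f, u) == g, scale(g, v) == f, mul(u, v) == (1, 0))
```

Working with integer pairs keeps the check exact; a floating-point comparison of $\sqrt{2}$ expressions would only be approximate.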

This situation is a manifestation of the following general result.

3-4-1. Proposition. Suppose $f(x)$ and $g(x)$ are nonzero polynomials in
$F[x]$; then the following conditions are equivalent:

(1) $f(x)$ and $g(x)$ divide each other,

(2) there exists a unit $u(x) = u$ of $F[x]$ such that $g(x) = f(x)u$,

(3) there exists a unit $v(x) = v$ of $F[x]$ such that $f(x) = g(x)v$.


Proof: (1) $\Rightarrow$ (2): Since $f(x) \mid g(x)$ there exists $u(x) \in F[x]$ for which $g(x) =
f(x)u(x)$, and since $g(x) \mid f(x)$ there exists $v(x) \in F[x]$ for which $f(x) = g(x)v(x)$.
Consequently,

$$f(x) = f(x)u(x)v(x).$$

Because $F[x]$ is an integral domain, the cancellation law applies, so the nonzero
element $f(x)$ may be canceled to give

$$1 = u(x)v(x).$$

Thus, $u(x)$ is a unit of $F[x]$; so $u(x) = u$, an element of $F^*$, and $g(x) = f(x)u$.

(2) $\Rightarrow$ (3): Given $g(x) = f(x)u$ with $u = u(x)$ a unit of $F[x]$. Since $u \in F^*$
there exists $v \in F^*$ such that $uv = 1$. Thus, $v = v(x)$ is a unit of $F[x]$ and

$$g(x)v = f(x)uv = f(x).$$

(3) $\Rightarrow$ (1): Given $f(x) = g(x)v$ with $v = v(x)$ a unit of $F[x]$, we see that
$g(x) \mid f(x)$. Upon taking $u = u(x)$ for which $vu = 1$, we obtain $f(x)u = g(x)$ so
that $f(x) \mid g(x)$. This completes the proof. ∎

3-4-2. Remark. When condition (2) is satisfied, that is, when $g(x) =
f(x)u$ with $u = u(x)$ a unit of $F[x]$, it is customary to say that $g(x)$ is an
associate of $f(x)$, and to denote this by $g(x) \sim f(x)$. With this notation, the
assertion of 3-4-1 takes the form: $g(x) \sim f(x) \iff f(x) \sim g(x) \iff f(x)$ and
$g(x)$ divide each other.

It is easy to see that $\sim$ satisfies the following properties:

(i) $f(x) \sim f(x)$ [take $u = u(x) = 1$].

(ii) If $g(x) \sim f(x)$, then $f(x) \sim g(x)$.

(iii) If $f(x) \sim g(x)$ and $g(x) \sim h(x)$, then $f(x) \sim h(x)$.

Of course, (iii) is immediate because if $f(x)$ and $g(x)$ divide each other and also
$g(x)$ and $h(x)$ divide each other, then $f(x)$ and $h(x)$ divide each other.

These three properties, which we have already encountered on a number of
occasions, are known as the reflexive, symmetric, and transitive properties (or
laws), respectively. They occur within a very general framework which we
shall now discuss, even though it is not essential to our needs.

3-4-3. Digression. Consider an arbitrary set $X = \{a, b, c, \ldots, x, y, z, \ldots\}$
and suppose we are given a relation $\rho$ on $X$. We do not wish to be overly fancy
or precise about the meaning of a "relation." Suffice it to say that for any
pair of elements $a, b$ in $X$ (in the given order) either $a$ is related to $b$ with
respect to the relation $\rho$, in which case we write $a \,\rho\, b$, or else $a$ is not
related to $b$ with respect to $\rho$, in which case we write $a \not\rho\; b$. In other words,
for any given pair of elements $a, b$ of $X$ the statement $a \,\rho\, b$ is either true or it is
false (but not both!).

Let us list some examples of the notion of relation. 

(1) Let $X$ be the set $\mathbf{R}$ of all real numbers; for $a, b \in \mathbf{R}$, $a \,\rho\, b$ is taken to
mean $a < b$.

(2) Let $X$ be the set of all triangles in the plane; for $a, b \in X$, let $a \,\rho\, b$ have
the meaning that the triangle $a$ has the same area as the triangle $b$.

(3) Let $X$ be an integral domain $D$; for $a, b \in D$, $a \,\rho\, b$ means $a$ divides $b$.

(4) Let $X$ be a polynomial ring $F[x]$; for $f(x), g(x) \in F[x]$ we define
$f(x) \,\rho\, g(x)$ to mean $\deg f(x) = \deg g(x)$.

(5) Let $X = \mathbf{Z}$ and $\rho$ be congruence (mod $m$). In other words, when $m > 1$
is fixed and $a, b \in \mathbf{Z}$, we write $a \,\rho\, b$ to signify that $a \equiv b \pmod{m}$.

(6) Let $X$ be any ring $R$ and fix a subring $S$; for $a, b \in R$ take $a \,\rho\, b$ to
mean $a - b \in S$.

The most interesting relations are those which satisfy the reflexive, symmetric,
and transitive properties. In more detail, any such relation $\rho$ is known
as an equivalence relation, and it satisfies

(i) $a \,\rho\, a$ for all $a \in X$.

(ii) If $a \,\rho\, b$, then $b \,\rho\, a$.

(iii) If $a \,\rho\, b$ and $b \,\rho\, c$, then $a \,\rho\, c$.

Of the examples listed above, it is clear that (2), (4), (5), (6) are equivalence
relations while (1) and (3) are not.

The importance of equivalence relations derives from the following facts.
If $\rho$ is an equivalence relation on the set $X$, then [in analogy with what was
done in 1-7-8 for congruence (mod $m$)], for every $a \in X$ we write

$$\lfloor a \rfloor_\rho = \{b \in X \mid b \,\rho\, a\}$$

and call it the equivalence class of $a$ (with respect to $\rho$). By the same kinds of
arguments used in 1-7-10, keeping in mind that $b \in \lfloor a \rfloor_\rho \iff b \,\rho\, a$, it follows
that the equivalence classes satisfy the properties:

(i) $a \in \lfloor a \rfloor_\rho$ for all $a \in X$,

(ii) $b \in \lfloor a \rfloor_\rho \iff a \in \lfloor b \rfloor_\rho$,

(iii) $b \in \lfloor a \rfloor_\rho \Rightarrow \lfloor b \rfloor_\rho = \lfloor a \rfloor_\rho$,

(iv) $\lfloor a \rfloor_\rho \cap \lfloor b \rfloor_\rho \neq \emptyset \Rightarrow \lfloor a \rfloor_\rho = \lfloor b \rfloor_\rho$.

Consequently, each element of $X$ belongs to some equivalence class, and any
two equivalence classes are disjoint or identical. We conclude that the set of
equivalence classes with respect to $\rho$ provides a disjoint decomposition (or
partition) of $X$.
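On a finite set the disjoint decomposition can be built explicitly. The sketch below is an illustration under stated assumptions: the function name `equivalence_classes` is ours, and the routine is only meaningful when the supplied predicate `related` really is reflexive, symmetric, and transitive, since it compares each new element with a single representative of every class found so far.

```python
def equivalence_classes(elements, related):
    """Group the elements into classes of the equivalence relation `related`.

    Assumes `related` is an equivalence relation; each new element is
    compared only with the first member of every existing class.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])     # x starts a new class
    return classes


# Congruence (mod 3) on the integers 0, 1, ..., 9
classes = equivalence_classes(range(10), lambda a, b: (a - b) % 3 == 0)
print(classes)
```

Every element lands in exactly one list, which mirrors the statement that two equivalence classes are either disjoint or identical.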

Returning to $F[x]$, we note that in virtue of 3-4-2 and 3-4-3 the relation
$\rho = {\sim}$ (where both $\rho$ and $\sim$ signify "is an associate of") is an equivalence
relation. In truth, as may be seen from 3-4-1 and 3-4-2, the relation $\sim$ is
defined only for nonzero polynomials (that is, on the set $F[x] - \{0\}$, obtained
by removing the zero polynomial from $F[x]$); however, we shall be careless and
say that $\sim$ is an equivalence relation on $F[x]$, instead of insisting on $F[x] - \{0\}$.
Of course, any equivalence class is of form

$$\lfloor f(x) \rfloor_\sim = \{g(x) \in F[x] \mid g(x) \sim f(x)\}$$

where $f(x) \neq 0$ [so $g(x) \in \lfloor f(x) \rfloor_\sim \iff g(x) \sim f(x) \iff g(x) = f(x)$ times a unit],
and the equivalence classes provide a disjoint decomposition of $F[x] - \{0\}$.

Our next result indicates that divisibility is really a question about
equivalence classes, but we prefer to express it in less abstract terms.


3-4-4. Proposition. Given nonzero polynomials $f(x), g(x) \in F[x]$, then
$f(x) \mid g(x) \iff$ any associate of $f(x)$ divides any associate of $g(x)$. More
precisely, if $f'(x)$ is an associate of $f(x)$ and $g'(x)$ is an associate of $g(x)$,
then

$$f(x) \mid g(x) \iff f'(x) \mid g'(x).$$


Proof: By hypothesis, we have $f'(x) = f(x)u$, $g'(x) = g(x)v$ where $u = u(x)$
and $v = v(x)$ are units of $F[x]$ (that is, $u, v \in F^*$). Both $u^{-1}$ and $v^{-1}$ exist and
are units, so we also have $f(x) = f'(x)u^{-1}$, $g(x) = g'(x)v^{-1}$. Now: $f(x) \mid g(x) \Rightarrow$
there exists $h(x) \in F[x]$ with $g(x) = f(x)h(x) \Rightarrow g'(x) = g(x)v = f(x)h(x)v =
(f(x)u)(u^{-1}h(x)v) = f'(x)(h(x)u^{-1}v)$, which says that $f'(x) \mid g'(x)$. The proof in
the other direction goes exactly the same way. ∎

In the situation considered above, any element of $\lfloor f(x) \rfloor_\sim$ divides any
element of $\lfloor g(x) \rfloor_\sim$, so it would be meaningful to say that $\lfloor f(x) \rfloor_\sim$ divides
$\lfloor g(x) \rfloor_\sim$. It is simpler, however, to work with a specific representative (chosen
according to the next result) of each equivalence class to settle questions of
divisibility.


3-4-5. Proposition. Every nonzero polynomial is an associate of a unique
monic polynomial. In other words, every equivalence class of $F[x]$ contains
exactly one monic polynomial.

Proof: Write the given nonzero polynomial explicitly as
$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ with $a_n \neq 0$. Since $a_n \in F^*$, it has an inverse
$a_n^{-1} \in F^*$, and $a_n^{-1}$ is a unit of $F[x]$. (In familiar situations such as $F$ equals $\mathbf{Q}$
or $\mathbf{R}$ we would write $1/a_n$ for $a_n^{-1}$.) Then the polynomial

$$
\begin{aligned}
f'(x) &= a_n^{-1} f(x) = f(x) a_n^{-1} \\
&= (a_0 a_n^{-1}) + (a_1 a_n^{-1})x + \cdots + (a_{n-1} a_n^{-1})x^{n-1} + x^n
\end{aligned}
$$

is monic and it is an associate of $f(x)$.

As for uniqueness, suppose $f''(x)$ is another monic associate of $f(x)$. Then
$f''(x)$ is an associate of $f'(x)$ and we have

$$f''(x) = f'(x)u$$

for some $u \in F^*$. Of course, both $f'(x)$ and $f''(x)$ have the same degree, $n$, and
by comparing the leading coefficients of $f''(x)$ and $f'(x)u$, we see that $u = 1$.
Thus, there is indeed a unique monic associate of $f(x)$.

To prove the second version (which is really just a rephrasing of what has
already been proved) we observe that every equivalence class is of form
$\lfloor f(x) \rfloor_\sim$ with $f(x) \neq 0$. As above, let $f'(x)$ be the unique monic associate of
$f(x)$. In general, for $g(x) \in F[x]$, we have:

$g(x)$ is a monic polynomial in $\lfloor f(x) \rfloor_\sim \iff g(x)$ is a monic associate of $f(x)$.

It follows that $f'(x)$ is the unique monic polynomial in $\lfloor f(x) \rfloor_\sim$, and the proof
is complete. ∎
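When $F = \mathbf{Z}_p$ with $p$ prime, the passage to the monic associate is a one-line computation. The following sketch assumes Python 3.8 or later (for modular inverses via the built-in `pow(a, -1, p)`); the name `monic_associate` and the lowest-degree-first coefficient order are our own conventions.

```python
def monic_associate(coeffs, p):
    """Return the monic associate of a nonzero polynomial in Z_p[x], p prime.

    coeffs[i] is the coefficient of x**i (an assumed convention).
    """
    coeffs = [a % p for a in coeffs]
    while coeffs and coeffs[-1] == 0:      # drop zero leading coefficients
        coeffs.pop()
    if not coeffs:
        raise ValueError("the zero polynomial has no monic associate")
    inv = pow(coeffs[-1], -1, p)           # inverse of the leading coefficient
    return [(a * inv) % p for a in coeffs]


# 6x + 2 and 3x + 1 are associates in Z_7[x] since 6x + 2 = 2(3x + 1);
# both have the same monic associate.
print(monic_associate([2, 6], 7), monic_associate([1, 3], 7))
```

Two polynomials are associates exactly when this function returns the same list for both, which is a concrete form of the statement that each equivalence class contains exactly one monic polynomial.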

So far, in this section, we have been dealing with peripheral or preliminary
results. The heart of the matter for us is to study factorization in $F[x]$ and to
show that it is "essentially unique." The problem of factorization is not a
strange one. In Chapter I, we made a detailed study of the integers $\mathbf{Z}$, and
proved that every nonzero integer has an essentially unique factorization into
primes. Moreover (as the reader may convince himself by a careful examination
of the structure of the discussion in Sections 1-2, 1-3, 1-4), in the final
analysis, the key result, on which the entire edifice of uniqueness of factorization
rests, turns out to be the division algorithm. We shall see that the same
kind of situation occurs in $F[x]$; namely, there is a division algorithm in
$F[x]$, and one eventually derives uniqueness of factorization from it.


3-4-6. Division Algorithm. Given polynomials $f(x) \neq 0$ and $g(x)$ in $F[x]$,
there exist unique polynomials $q(x)$ and $r(x)$ in $F[x]$ such that

$$g(x) = q(x)f(x) + r(x) \qquad \text{where } \deg r(x) < \deg f(x).$$

Proof: Note that in view of our convention about the degree of the zero
polynomial being $-\infty$ ($\deg(0) = -\infty$), the condition $\deg r(x) < \deg f(x)$ is
really another way of saying that either $r(x) = 0$, or else, when $r(x) \neq 0$, its
degree is less than the degree of $f(x)$. Naturally, $q(x)$ and $r(x)$ are known as
the quotient and remainder, respectively.

The proof is not hard but somewhat formal, so let us indicate how it goes
by a concrete example. Consider $f(x) = 3x^2 - x + 1$ and $g(x) = 2x^4 + 5x^3 - x^2 + 2$
in $\mathbf{R}[x]$. If we multiply $f(x)$ by $\tfrac{2}{3}x^2$, the result is

$$(\tfrac{2}{3}x^2)f(x) = 2x^4 - \tfrac{2}{3}x^3 + \tfrac{2}{3}x^2$$

which enables us to eliminate the leading term, $2x^4$, of $g(x)$ by subtracting
$(\tfrac{2}{3}x^2)f(x)$; we have then

$$g(x) - (\tfrac{2}{3}x^2)f(x) = \tfrac{17}{3}x^3 - \tfrac{5}{3}x^2 + 2$$

or, what is the same,

$$g(x) = (\tfrac{2}{3}x^2)f(x) + (\tfrac{17}{3}x^3 - \tfrac{5}{3}x^2 + 2).$$

(We observe in passing that this cannot be done when $f(x)$ and $g(x)$ are viewed
in $\mathbf{Z}[x]$, since $\tfrac{2}{3}$ is not an element of $\mathbf{Z}$; however, because $\mathbf{R}$ is a field, the steps
we have performed are permissible; similarly, because $\mathbf{Q}$ is a field, the same
procedure is applicable when $f(x)$ and $g(x)$ are viewed in $\mathbf{Q}[x]$.)

If we put $g'(x) = \tfrac{17}{3}x^3 - \tfrac{5}{3}x^2 + 2$, then

$$g(x) = (\tfrac{2}{3}x^2)f(x) + g'(x) \qquad (*)$$

and the same process can be applied to $f(x)$ and $g'(x)$. More precisely, $(\tfrac{17}{9}x)f(x)$
has the same leading coefficient as $g'(x)$, and upon subtraction we obtain

$$g'(x) - (\tfrac{17}{9}x)f(x) = \tfrac{2}{9}x^2 - \tfrac{17}{9}x + 2.$$

Denoting the right side by $g''(x)$, we have

$$g'(x) = (\tfrac{17}{9}x)f(x) + g''(x)$$

and this may be substituted in $(*)$ to give

$$g(x) = (\tfrac{2}{3}x^2 + \tfrac{17}{9}x)f(x) + g''(x). \qquad (**)$$

Obviously, $\deg g'(x) > \deg g''(x)$, and each time our process is repeated the
degree of the "left-over term" (meaning $g'(x)$, $g''(x)$, ...) diminishes, so
eventually one obtains a left-over term of degree less than $\deg f(x) = 2$. In
fact, here we have

$$g''(x) - \tfrac{2}{27}f(x) = -\tfrac{49}{27}x + \tfrac{52}{27},$$

so calling the right side $g'''(x)$ yields

$$g''(x) = \tfrac{2}{27}f(x) + g'''(x)$$

and then

$$g(x) = (\tfrac{2}{3}x^2 + \tfrac{17}{9}x + \tfrac{2}{27})f(x) + g'''(x). \qquad ({*}{*}{*})$$

Because $\deg g'''(x) = 1 < 2 = \deg f(x)$, we put $r(x) = g'''(x)$ and $q(x) =
\tfrac{2}{3}x^2 + \tfrac{17}{9}x + \tfrac{2}{27}$, thus getting

$$g(x) = q(x)f(x) + r(x), \qquad \deg r(x) < \deg f(x).$$

Hence, we have shown explicitly how the division algorithm is valid for the
given polynomials $f(x)$ and $g(x)$. Of course, the familiar process of "long
division" is designed as a compact, efficient method for carrying out the
computations used above, and thus finding $q(x)$ and $r(x)$; in fact,

$$
\begin{array}{rl}
 & \tfrac{2}{3}x^2 + \tfrac{17}{9}x + \tfrac{2}{27} \\ \cline{2-2}
3x^2 - x + 1 \;\big) & 2x^4 + 5x^3 - x^2 + 0x + 2 \\
 & 2x^4 - \tfrac{2}{3}x^3 + \tfrac{2}{3}x^2 \\ \cline{2-2}
 & \tfrac{17}{3}x^3 - \tfrac{5}{3}x^2 + 0x \\
 & \tfrac{17}{3}x^3 - \tfrac{17}{9}x^2 + \tfrac{17}{9}x \\ \cline{2-2}
 & \tfrac{2}{9}x^2 - \tfrac{17}{9}x + 2 \\
 & \tfrac{2}{9}x^2 - \tfrac{2}{27}x + \tfrac{2}{27} \\ \cline{2-2}
 & -\tfrac{49}{27}x + \tfrac{52}{27}
\end{array}
$$
Now, let us give a fairly precise formal proof of the division algorithm in
its full generality; we do the existence part first. Namely, given $f(x) \neq 0$ and
$g(x)$ in $F[x]$ we show that there exist $q(x), r(x) \in F[x]$ such that

$$g(x) = q(x)f(x) + r(x), \qquad \deg r(x) < \deg f(x). \qquad (*)$$

If $\deg f(x) = 0$, then $f(x) = a_0$ with $a_0 \neq 0$, so upon putting $q(x) = a_0^{-1}g(x)$
and $r(x) = 0$ we see that $(*)$ holds. Therefore, we may assume, henceforth, that
$\deg f(x) \geq 1$.

Suppose next that $\deg g(x) < \deg f(x)$ (this always includes the case
where $g(x) = 0$); if we then put $q(x) = 0$ and $r(x) = g(x)$, then $(*)$ clearly
holds. Therefore, we may assume, henceforth, that $\deg g(x) \geq \deg f(x) \geq 1$.

Let us write $n = \deg g(x)$, $m = \deg f(x)$, so $n \geq m \geq 1$, and proceed by
induction on $n$. For each $n \geq 1$, let $\pi(n)$ be the following statement:

$$\pi(n): \quad \begin{cases} \text{if } f(x) \neq 0 \text{ and } \deg g(x) = n, \text{ then there exist} \\ q(x), r(x) \in F[x] \text{ such that } (*) \text{ holds.} \end{cases}$$

It is not hard to verify that $\pi(1)$ is true. In fact, if $n = 1$, then $m = 1$, so
$g(x)$ and $f(x)$ take the form

$$g(x) = b_0 + b_1 x, \qquad b_1 \neq 0,$$

$$f(x) = a_0 + a_1 x, \qquad a_1 \neq 0.$$

Putting $q(x) = b_1 a_1^{-1}$ and $r(x) = b_0 - b_1 a_1^{-1} a_0$, we see immediately that $(*)$
holds. Hence, $\pi(1)$ is true.

Now, suppose inductively that $\pi(k)$ is true for $k = 1, 2, \ldots, n - 1$, and
consider the polynomial $g(x)$ of degree $n$. We may write

$$g(x) = b_0 + b_1 x + \cdots + b_n x^n, \qquad b_n \neq 0,$$

$$f(x) = a_0 + a_1 x + \cdots + a_m x^m, \qquad a_m \neq 0,$$

where, as usual, $n \geq m \geq 1$. Clearly, $(b_n a_m^{-1} x^{n-m})f(x)$ is a polynomial whose
leading term is $b_n x^n$, so upon subtracting it from $g(x)$ we obtain a polynomial

$$g'(x) = g(x) - (b_n a_m^{-1} x^{n-m})f(x)$$

whose degree is less than or equal to $n - 1$. If $g'(x) = 0$, then taking $q(x) =
b_n a_m^{-1} x^{n-m}$ and $r(x) = 0$ gives the validity of $(*)$. If $\deg g'(x) = 0$, then we may
take $q(x) = b_n a_m^{-1} x^{n-m}$, $r(x) = g'(x)$; and $(*)$ holds because $\deg f(x) \geq 1$. It
remains to consider the (most common) case where $\deg g'(x) \geq 1$ (and $\deg g'(x)$
is still less than or equal to $n - 1$). By the induction hypothesis, there exist
$q'(x), r'(x) \in F[x]$ such that

$$g'(x) = q'(x)f(x) + r'(x), \qquad \deg r'(x) < \deg f(x).$$

Therefore,

$$g(x) = [b_n a_m^{-1} x^{n-m} + q'(x)]f(x) + r'(x),$$

so if we put $q(x) = b_n a_m^{-1} x^{n-m} + q'(x)$ and $r(x) = r'(x)$, then $(*)$ holds for $g(x)$.
Thus, $\pi(n)$ is also true. Because $\pi(n)$ is now true for every $n = 1, 2, \ldots$, the
existence proof for the division algorithm is complete.

To prove uniqueness for the division algorithm, we suppose (just as was
done in $\mathbf{Z}$; see 1-2-1) that there are two versions, say

$$g(x) = q_1(x)f(x) + r_1(x), \qquad \deg r_1(x) < \deg f(x),$$

$$g(x) = q_2(x)f(x) + r_2(x), \qquad \deg r_2(x) < \deg f(x).$$

Consequently,

$$(q_1(x) - q_2(x))f(x) = r_2(x) - r_1(x). \qquad (\#)$$

Applying the rules for degrees of polynomials (see 3-3-5) we have
$\deg(r_2(x) - r_1(x)) < \deg f(x)$.

On the other hand, since $f(x) \neq 0$, its leading coefficient, which is a nonzero
element of $F$, is surely not a zero-divisor, so we have

$$\deg((q_1(x) - q_2(x))f(x)) = \deg(q_1(x) - q_2(x)) + \deg f(x).$$

Therefore,

$$\deg(q_1(x) - q_2(x)) + \deg f(x) < \deg f(x)$$

and

$$\deg(q_1(x) - q_2(x)) < 0.$$

This tells us that $q_1(x) - q_2(x) = 0$, and then from $(\#)$, $r_2(x) - r_1(x) = 0$. Thus,
$q_1(x) = q_2(x)$, $r_1(x) = r_2(x)$, and the quotient and remainder in the division
algorithm are unique. ∎
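Over a small finite field, uniqueness can also be observed by exhaustive search. The sketch below, with hypothetical helper names `trim`, `poly_add`, and `poly_mul`, works in $\mathbf{Z}_3[x]$ and lists every pair $q(x)$, $r(x)$ with $\deg q(x) \leq 2$ and $\deg r(x) < \deg f(x)$ that satisfies $g(x) = q(x)f(x) + r(x)$; the particular polynomials are chosen only for illustration.

```python
from itertools import product


def trim(f):
    """Drop zero leading coefficients (lists are lowest degree first)."""
    f = list(f)
    while f and f[-1] == 0:
        f.pop()
    return f


def poly_add(f, g, p):
    n = max(len(f), len(g))
    f = list(f) + [0] * (n - len(f))
    g = list(g) + [0] * (n - len(g))
    return trim([(a + b) % p for a, b in zip(f, g)])


def poly_mul(f, g, p):
    if not f or not g:
        return []
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return trim(out)


p = 3
f = [1, 0, 1]        # x^2 + 1
g = [1, 2, 0, 1]     # x^3 + 2x + 1

solutions = []
for q in product(range(p), repeat=3):        # all q with deg q <= 2
    for r in product(range(p), repeat=2):    # all r with deg r < deg f = 2
        if poly_add(poly_mul(q, f, p), r, p) == trim(g):
            solutions.append((trim(q), trim(r)))

print(solutions)
```

The search turns up a single pair, in agreement with the uniqueness argument; the proof, of course, covers every field and every choice of polynomials at once.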

Once the division algorithm has been proved, the discussion of factorization
in $F[x]$ proceeds along a familiar path; it is completely parallel to the
discussion of factorization in $\mathbf{Z}$, as given in Chapter I. It would be perfectly
feasible, at this point, to challenge the reader to provide the details leading to
uniqueness of factorization into primes in $F[x]$. We shall, however, take a
less risky approach: proofs will be sketched briefly, but definitions and
statements of results will be given carefully.


3-4-7. Euclidean Algorithm. Given $f(x), g(x) \in F[x]$ with $f(x) \neq 0$, we have

$$
\begin{aligned}
g(x) &= q_1(x)f(x) + r_1(x), & \deg r_1(x) &< \deg f(x), \\
f(x) &= q_2(x)r_1(x) + r_2(x), & \deg r_2(x) &< \deg r_1(x), \\
r_1(x) &= q_3(x)r_2(x) + r_3(x), & \deg r_3(x) &< \deg r_2(x), \\
&\;\;\vdots \\
r_{i-2}(x) &= q_i(x)r_{i-1}(x) + r_i(x), & \deg r_i(x) &< \deg r_{i-1}(x), \\
&\;\;\vdots \\
r_{n-2}(x) &= q_n(x)r_{n-1}(x) + r_n(x), & \deg r_n(x) &< \deg r_{n-1}(x), \\
r_{n-1}(x) &= q_{n+1}(x)r_n(x) + 0.
\end{aligned}
$$


Proof: The only point at issue is that by repeating the division algorithm,
as indicated above, one reaches a remainder of 0 after a finite number of steps.
Suppose this is false; then for each $i = 1, 2, 3, \ldots$ the remainder $r_i(x)$ is
nonzero and hence has degree greater than or equal to 0. Thus, we have an
infinite sequence of polynomials $r_1(x), r_2(x), \ldots$ for which

$$\deg r_1(x) > \deg r_2(x) > \cdots > \deg r_i(x) > \deg r_{i+1}(x) > \cdots \geq 0.$$

This is clearly impossible. ∎

The notation is arranged so $r_n(x)$ denotes the last nonzero remainder. All
the preceding remainders $r_1(x), r_2(x), \ldots, r_{n-1}(x)$ are also nonzero. Note,
incidentally, that if we ever arrive at a remainder $r_i(x)$ of degree 0 (that is,
$r_i(x) = c \neq 0$ in $F$), then this $r_i(x)$ ends up being $r_n(x)$ because $r_i(x) = c$, being
a unit of $F[x]$, surely divides $r_{i-1}(x)$, so the next remainder is 0.





3-4-8. Examples. (1) Consider the polynomials f(x) = 12x^3 + 16x^2 - 3x,
g(x) = 12x^4 + 4x^3 - 13x^2 + 14x + 3 in R[x]. Their Euclidean algorithm takes
the form

12x^4 + 4x^3 - 13x^2 + 14x + 3 = (x - 1)(12x^3 + 16x^2 - 3x) + (6x^2 + 11x + 3),

12x^3 + 16x^2 - 3x = (2x - 1)(6x^2 + 11x + 3) + (2x + 3),

6x^2 + 11x + 3 = (3x + 1)(2x + 3),

and it is obtained by use of long division. In the notation of 3-4-7, we have:
q_1(x) = (x - 1), r_1(x) = 6x^2 + 11x + 3, q_2(x) = (2x - 1), r_2(x) = (2x + 3),
q_3(x) = (3x + 1).

(2) Consider f(x) = 2x^3 + 5x^2 + 5x + 3 and g(x) = 4x^3 + 3x - 3 as
polynomials in Z_7[x]. By long division, and keeping in mind that all
coefficients come from Z_7, we obtain for the Euclidean algorithm

4x^3 + 3x - 3 = 2(2x^3 + 5x^2 + 5x + 3) + (4x^2 - 2),

2x^3 + 5x^2 + 5x + 3 = (4x + 3)(4x^2 - 2) + (6x + 2),

4x^2 - 2 = (3x - 1)(6x + 2).

In particular, 6x + 2 (which is the same as -x - 5) is the last nonzero
remainder.
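
The long divisions of Example (2) can be checked mechanically. The following is only a sketch under stated assumptions: a polynomial in Z_p[x] is stored as a list of integer coefficients starting with the constant term, p is assumed prime, and the helper names trim, poly_divmod and euclid are hypothetical ones introduced here for illustration.

```python
def trim(a):
    """Drop trailing zeros; coefficient lists run from the constant term upward."""
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def poly_divmod(g, f, p):
    """Division algorithm in Z_p[x]: return (q, r) with g = q*f + r, deg r < deg f."""
    g, f = trim([c % p for c in g]), trim([c % p for c in f])
    if not f:
        raise ZeroDivisionError("division by the zero polynomial")
    inv_lead = pow(f[-1], -1, p)      # inverse of the leading coefficient (Python 3.8+)
    q = [0] * max(len(g) - len(f) + 1, 1)
    r = g[:]
    while len(r) >= len(f):
        shift = len(r) - len(f)
        coeff = (r[-1] * inv_lead) % p
        q[shift] = coeff
        for i, c in enumerate(f):     # subtract coeff * x^shift * f(x)
            r[i + shift] = (r[i + shift] - coeff * c) % p
        r = trim(r)
    return trim(q), r

def euclid(f, g, p):
    """List the pairs (q_i, r_i) of 3-4-7, ending with the zero remainder []."""
    steps = []
    a, b = g, f
    while trim([c % p for c in b]):
        q, r = poly_divmod(a, b, p)
        steps.append((q, r))
        a, b = b, r
    return steps

f = [3, 5, 5, 2]       # 2x^3 + 5x^2 + 5x + 3
g = [-3, 3, 0, 4]      # 4x^3 + 3x - 3
for q, r in euclid(f, g, 7):
    print(q, r)
```

With these inputs the quotients come out as 2, 4x + 3 and 3x + 6 (that is, 3x - 1), and the last nonzero remainder is 6x + 2, in agreement with the computation above.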


3-4-9. Definition. Let nonzero polynomials f(x), g(x) ∈ F[x] be given. A
polynomial d(x) ∈ F[x] which satisfies

(i) d(x) | f(x) and d(x) | g(x)

is said to be a common divisor of f(x) and g(x). If, in addition, d(x) satisfies
the conditions,

(ii) if c(x) | f(x) and c(x) | g(x), then c(x) | d(x),

(iii) d(x) is monic,

then d(x) is said to be a greatest common divisor of f(x) and g(x).


3-4-10. Proposition. Given nonzero polynomials f(x), g(x) ∈ F[x], then:

(i) Their greatest common divisor exists and is unique; we denote it by
d(x) = (f(x), g(x)).

(ii) We can compute d(x) via the Euclidean algorithm; more precisely,
if r_n(x) is the last nonzero remainder in the Euclidean algorithm,
then its unique monic associate (which is obtained by multiplying
r_n(x) by the inverse of its leading coefficient) is d(x).

(iii) We can express d(x) as a linear combination of f(x) and g(x); that is,
we can find s(x), t(x) ∈ F[x] for which

d(x) = s(x)f(x) + t(x)g(x).

(iv) The gcd d(x) is the monic polynomial of smallest degree which can
be expressed as a linear combination of f(x) and g(x).

Proof: (i) uniqueness. Suppose that both d(x) and d'(x) are gcd's. Then
they must divide each other: d(x) | d'(x) and d'(x) | d(x). Because d(x) and
d'(x) are both monic, we must have d(x) = d'(x). This shows that if a gcd
exists, then it is unique.

existence. Let d(x) be the unique monic associate of r_n(x), the last
nonzero remainder of the Euclidean algorithm of f(x) and g(x); so if d ≠ 0 is
the leading coefficient of r_n(x), then d(x) = d^{-1} r_n(x). Thus, d(x) | r_n(x) and
working upward through the Euclidean algorithm we obtain, in turn,

d(x) | r_{n-1}(x), d(x) | r_{n-2}(x), ..., d(x) | f(x), d(x) | g(x),

so d(x) is a common divisor of f(x) and g(x).

On the other hand, if c(x) | f(x) and c(x) | g(x), then working downward
through the Euclidean algorithm we obtain, in turn,

c(x) | r_1(x), c(x) | r_2(x), ..., c(x) | r_n(x),

so c(x) | d(x). Hence, d(x) satisfies all the requirements for the gcd of f(x)
and g(x).

This proves (i) and (ii). Of course, it is essentially the same proof as was
given for 1-3-2.

(iii) By working either upward or downward through the Euclidean
algorithm (as was done in 1-3-6) we see that r_n(x) can be expressed as a linear
combination of f(x) and g(x), say,

r_n(x) = s'(x)f(x) + t'(x)g(x).

Then if, as before, d denotes the leading coefficient of r_n(x), we have

d(x) = d^{-1} r_n(x) = (d^{-1} s'(x))f(x) + (d^{-1} t'(x))g(x)

and putting s(x) = d^{-1} s'(x), t(x) = d^{-1} t'(x) does it.

(iv) Suppose d'(x) is a monic polynomial of degree less than or equal to
deg d(x) which can be expressed as a linear combination of f(x) and g(x), say

d'(x) = p(x)f(x) + q(x)g(x).

Then d(x) | d'(x) and deg d(x) ≤ deg d'(x). Therefore, deg d(x) = deg d'(x);
and because d(x) | d'(x) with both of them monic, we conclude that d(x) =
d'(x). This completes the proof. |

3-4-11. Remarks. (1) In Z, we have the notion of order and the consequent
notion of "size" of an integer. In general, there is no ordering on a
polynomial ring F[x]; however, the notion of degree of a polynomial can often
be used to give a rough measure of size. For example, in the Euclidean
algorithm, the degree provides a measure of the size of the remainders, and it is
the key ingredient for guaranteeing that the Euclidean algorithm has only a
finite number of steps.

(2) What is the significance of the requirement that d(x) be monic in the
definition of gcd (see 3-4-9)? Suppose our definition of gcd requires only that
conditions (i) and (ii) hold; that is, that d(x) is a common divisor of f(x) and
g(x) which is divisible by every common divisor. Then any two polynomials
which satisfy these conditions divide each other; so they are associates, and
belong to the same equivalence class with respect to ~ (the equivalence
relation on F[x] which we discussed earlier in this section). The only way to assure
uniqueness of the gcd is to specify, somehow, a choice of a single element of
this equivalence class; clearly, taking the unique monic one is a convenient
way to do this.

(3) Incidentally, the requirement that the gcd be monic plays exactly the
same role as the requirement, in Z, that the gcd d of two nonzero integers a
and b be positive (see 1-3-1). In more detail, suppose the definition of gcd in
Z only calls for a common divisor of a and b which is divisible by every
common divisor. Then any two integers d_1, d_2 which fulfill these conditions
for gcd must divide each other. As in 3-4-1, this implies that d_2 equals d_1 times
a unit of Z; so, as in 3-4-2, we may say that d_2 is an associate of d_1 and write
d_2 ~ d_1. Now, ~ is clearly an equivalence relation on Z [really, on Z - (0)],
and the equivalence class [d_1]_~ determined by d_1 consists of all integers of
form d_1 times a unit. But there are only two units in Z, namely ±1, so
the equivalence class [d_1]_~ consists of the two elements d_1, -d_1 (and, in
general, any equivalence class has exactly two elements). In particular,
d_2 = ±d_1. To obtain uniqueness for the gcd, we must specify an element of
the equivalence class

[d_1]_~ = {d_1, -d_1}

and the best way to do this is to choose the positive element of the pair.

(4) Consider the concrete example of 3-4-8. If

f(x) = 12x^3 + 16x^2 - 3x, g(x) = 12x^4 + 4x^3 - 13x^2 + 14x + 3

in R[x], then the last nonzero remainder in the Euclidean algorithm is 2x + 3,
so the gcd is d(x) = x + 3/2. To express 2x + 3 as a linear combination of f(x)
and g(x), we have

2x + 3 = f(x) - (2x - 1)(6x^2 + 11x + 3)

= f(x) - (2x - 1)[g(x) - (x - 1)f(x)]

= (2x^2 - 3x + 2)f(x) + (1 - 2x)g(x).

Consequently,

d(x) = x + 3/2 = (x^2 - (3/2)x + 1)f(x) + (1/2 - x)g(x).

In similar fashion, consider

f(x) = 2x^3 + 5x^2 + 5x + 3, g(x) = 4x^3 + 3x - 3

in Z_7[x]. The last nonzero remainder of the Euclidean algorithm is 6x + 2,
which is expressed as a linear combination of f(x) and g(x) by

6x + 2 = f(x) - (4x + 3)(4x^2 - 2)

= f(x) - (4x + 3)(g(x) - 2f(x))

= (8x + 7)f(x) + (-4x - 3)g(x)

= x f(x) + (3x + 4)g(x).

Then, 6^{-1} (the inverse of 6 in Z_7) equals 6, since 6 · 6 = 1 in Z_7; so d(x), the
gcd, equals x + 5, and

d(x) = 6^{-1}(6x + 2) = x + 5 = (6x)f(x) + (4x + 3)g(x).
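
The back-substitution carried out in the Z_7[x] example can be automated by tracking, alongside each remainder, its expression as a combination of f(x) and g(x). What follows is a minimal sketch rather than a definitive routine; it assumes p is prime and that polynomials are coefficient lists with the constant term first, and the names add, mul, divmod_poly and xgcd are hypothetical.

```python
def trim(a):
    while a and a[-1] == 0:
        a = a[:-1]
    return a

def add(a, b, p):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(x + y) % p for x, y in zip(a, b)])

def mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def divmod_poly(g, f, p):
    """Return (q, r) with g = q*f + r and deg r < deg f in Z_p[x]."""
    q, r = [0] * max(len(g) - len(f) + 1, 1), g[:]
    inv = pow(f[-1], -1, p)
    while len(r) >= len(f):
        k, c = len(r) - len(f), (r[-1] * inv) % p
        q[k] = c
        r = add(r, mul([0] * k + [(-c) % p], f, p), p)   # r - c*x^k*f
    return trim(q), r

def xgcd(f, g, p):
    """Return (d, s, t) with d monic and d = s*f + t*g, dividing g by f first."""
    f, g = trim([c % p for c in f]), trim([c % p for c in g])
    # Invariants: r0 = s0*f + t0*g and r1 = s1*f + t1*g.
    r0, s0, t0 = g, [], [1]
    r1, s1, t1 = f, [1], []
    while r1:
        q, r = divmod_poly(r0, r1, p)
        neg_q = [(-c) % p for c in q]
        r0, s0, t0, r1, s1, t1 = (r1, s1, t1, r,
                                  add(s0, mul(neg_q, s1, p), p),
                                  add(t0, mul(neg_q, t1, p), p))
    inv = pow(r0[-1], -1, p)          # normalize to the monic associate
    scale = lambda a: [(inv * c) % p for c in a]
    return scale(r0), scale(s0), scale(t0)

f = [3, 5, 5, 2]       # 2x^3 + 5x^2 + 5x + 3
g = [-3, 3, 0, 4]      # 4x^3 + 3x - 3
d, s, t = xgcd(f, g, 7)
print(d, s, t)
```

For these inputs the sketch returns d(x) = x + 5, s(x) = 6x and t(x) = 4x + 3, the same combination found by hand. The coefficients s(x), t(x) are not unique in general, so a different order of division could produce a different but equally valid pair.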

(5) In virtue of 3-4-10, the gcd of f(x) and g(x) is the monic polynomial of 
the largest degree which is a common divisor. As a matter of fact, it is not 
hard to see that this property may be taken as the definition of the gcd of 
f(x) and g(x). However, the definition we have given in 3-4-9—as the monic 
common divisor which is divisible by every common divisor—is the more 
appropriate one for generalization to other rings in which one may wish to 
consider gcds. 

(6) If d(x) = (f(x), g(x)) = 1, then f(x) and g(x) are said to be relatively
prime. In particular, if the last nonzero remainder r_n(x) of the Euclidean
algorithm for f(x) and g(x) is a constant (r_n(x) = d ≠ 0, d ∈ F), then d(x) =
d^{-1} r_n(x) = 1, so f(x) and g(x) are relatively prime.

(7) Our discussion of gcd may also be extended to situations where the
Euclidean algorithm provides no information; for example, when f(x) | g(x)
or when one of the polynomials f(x), g(x) is 0. The treatment of such situations
parallels the remarks made in 1-3-11 and 1-3-12.


3-4-12. Proposition. If f(x) | g(x)h(x) while f(x) and g(x) are relatively
prime, then f(x) | h(x).


Proof: Since (f(x), g(x)) = 1 there exist s(x), t(x) ∈ F[x] with

1 = s(x)f(x) + t(x)g(x).

Therefore,

h(x) = s(x)f(x)h(x) + t(x)g(x)h(x)

and both terms on the right side are divisible by f(x). Hence, f(x) | h(x). |

3-4-13. Definition. Suppose p(x) ∈ F[x] is a nonzero polynomial which is
not a unit. Then p(x) is said to be prime or irreducible in F[x] (or over F) when
the only divisors of p(x) are the units and the associates of p(x).

According to the definition, only polynomials of degree greater than or
equal to 1 are candidates for being primes. Now, any polynomial f(x) of
degree greater than or equal to 1 is automatically divisible by all the units
(that is, by every element of F*) and also by all its associates (that is, by all
elements of form f(x)u with u a unit, or to put it another way, by every
element of [f(x)]_~). The primes are those polynomials which have no divisors
except the automatic ones. Our definition of irreducibility, or primeness, is
equivalent to another very common formulation as given in the next result:


3-4-14. Proposition. Suppose p(x) ∈ F[x] is a polynomial of degree n ≥ 1;
then

p(x) is prime <=> p(x) cannot be expressed as a product of two 
polynomials of degree greater than or equal to 1 

or, what is equivalent, 

p(x) is not prime <=> p(x) can be expressed as a product of 
two polynomials of degree greater than or equal to 1. 


Proof: We shall prove the second version after recalling a few basic facts
about polynomials in F[x], namely:

(i) deg(f(x)g(x)) = deg f(x) + deg g(x),

(ii) f(x) is a unit if and only if deg f(x) = 0,

(iii) if two polynomials are associates they have the same degree.

Suppose p(x) is not prime; so there exists a divisor f(x) of p(x) which is not
a unit or an associate, and we may write

p(x) = f(x)g(x).

Since f(x) is not a unit, deg f(x) ≥ 1. If deg g(x) = 0, then g(x) is a unit and
f(x) is an associate of p(x), a contradiction. Hence deg g(x) ≥ 1, and we have
proved the implication =>.

Conversely, suppose p(x) can be expressed as a product p(x) = f(x)g(x) of
two polynomials of degree greater than or equal to 1. We have deg f(x) ≥ 1,
so f(x) is not a unit. Since deg g(x) ≥ 1 and deg f(x) + deg g(x) = deg p(x) =
n, we have deg f(x) < n, so f(x) is not an associate of p(x). Thus, p(x), being
divisible by f(x) which is neither a unit nor an associate, is not prime. This
proves the implication <=. |

Naturally, a polynomial of degree 1 or more that is not irreducible (that
is, it has a "nontrivial" factorization) is said to be reducible or composite.

Obviously, any polynomial of degree 1 is irreducible. The question of 
irreducibility for polynomials of degree 2 or more is complicated, and we shall 
eventually have a few things to say about it—but here we stick to the main 
theme. 


3-4-15. Proposition. Suppose p(x), f_1(x), f_2(x), ..., f_n(x) are polynomials
of degree 1 or more over F. If p(x) is prime and

p(x) | f_1(x) f_2(x) ··· f_n(x),

then p(x) divides at least one of the f_i(x).


Proof: If p(x) divides f_1(x), we are finished. If p(x) ∤ f_1(x), then because
p(x) is prime it is easy to see that p(x) and f_1(x) are relatively prime,
(p(x), f_1(x)) = 1. Application of 3-4-12 now yields

p(x) | f_2(x) f_3(x) ··· f_n(x),

so, proceeding inductively, we obtain the desired result. |

3-4-16. Theorem. Let f(x) ∈ F[x] be a polynomial of degree 1 or more.
Then f(x) can be expressed uniquely (up to order) as a product of a nonzero
constant (that is, an element of F*) and a finite number of monic
irreducible polynomials over F.


Proof: uniqueness. Suppose f(x) has two factorizations

f(x) = a p_1(x) p_2(x) ··· p_r(x) = b q_1(x) q_2(x) ··· q_s(x)

where p_1(x), ..., p_r(x), q_1(x), ..., q_s(x) are monic and irreducible, and a, b are
nonzero elements of F. Since a product of monic polynomials is monic, we
see that a is the leading coefficient of f(x), and so is b. Hence, a = b, and by
cancellation we have

p_1(x) p_2(x) ··· p_r(x) = q_1(x) q_2(x) ··· q_s(x).

Now, p_1(x) divides the right side, so according to 3-4-15 it divides at least one
of the q_i(x), and we may reindex so that p_1(x) | q_1(x). Since both of these are
monic and prime, it follows that they are equal. Then by cancellation,

p_2(x) ··· p_r(x) = q_2(x) ··· q_s(x)

and an induction takes care of the rest.

existence. Suppose the desired factorization does not exist for some
polynomial of degree greater than or equal to 1. Let f(x) then be a polynomial
of minimal degree greater than or equal to 1 which does not factor into the
product of a nonzero constant and a finite number of monic irreducible
polynomials. Then f(x) is not irreducible, so it factors into f(x) = g(x)h(x),
both of which have degree greater than or equal to 1. Since both g(x) and h(x)
have degree less than deg f(x), minimality tells us that both g(x) and h(x)
factor into the product of a nonzero constant and a finite number of monic
irreducible polynomials. Putting these together provides f(x) with a
factorization of the same form, a contradiction. |

Our development of unique factorization in F[x] is substantially the same
as the development of unique factorization in Z that was given in Chapter I,
especially in Section 1-4. But there are some insubstantial alterations.
According to the definition of a prime in F[x], if p(x) is prime, then so is any
associate p(x)u, u ∈ F* (for example, in R[x], 2x + 3, x + (3/2), πx + (3π/2)
are associates of each other and each of them is a prime). Since nothing is said
about this, the presumption is that these are distinct primes. But this is
inappropriate, since all polynomials in the equivalence class [p(x)]_~ are
interchangeable in questions of divisibility and factorization. Thus, when we want
to have uniqueness of prime factorization, it is necessary to pick a single
element from [p(x)]_~. This is the reason why the irreducible polynomials in
3-4-16 are required to be monic.

On the other hand, in Z, instead of permitting either element of an
equivalence class (such as [7]_~ = {7, -7} = [-7]_~) to be considered as a
prime, we always ignored the negative element and required that a prime
be positive. The analogous choice for F[x] would be to permit only monic
polynomials to be irreducible. Of course, the reader who wishes can adjust the
definitions of a prime in Z and in F[x] to make them correspond more exactly,
and then carry out the two treatments in exactly parallel fashion. In any case,
the important objective is always to arrive at unique factorization into primes.

The reader should also note that the prime factorization of a polynomial
includes a nonzero constant (in other words, a unit of F[x]). This corresponds
to the fact that the prime factorization of an integer n includes a term ±1 (that
is, a unit of Z).

The problem of determining the prime factorization of a concrete
polynomial or of just deciding if a given polynomial is irreducible is, to put it
mildly, far from trivial. Much depends on the structure of the field F. For
example, let us look at a simple polynomial of degree 2, x^2 + 1 in F[x]. If it
is reducible over F (that is, if it factors in F[x]), its factorization must be of
form

x^2 + 1 = (x - a)(x - b), a, b ∈ F.

So its reducibility will depend on whether the polynomial x^2 + 1 has a root
in F. Thus:

(i) x^2 + 1 is irreducible in R[x], because there is no real number whose
square is -1.

(ii) x^2 + 1 is reducible in C[x]; in fact, for i = √(-1),

x^2 + 1 = (x - i)(x + i).

(iii) x^2 + 1 is reducible in Z_2[x]; in fact, here

x^2 + 1 = (x + 1)^2.

(iv) x^2 + 1 is irreducible over Z_7 because, by trial and error, one sees that
x^2 + 1 has no root in Z_7.

(v) x^2 + 1 is reducible over Z_13; in fact, in Z_13[x]

x^2 + 1 = (x - 5)(x - 8).

We shall return to such questions later.
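
The trial-and-error search for roots in cases (iii) to (v) is easy to mechanize. The sketch below is illustrative only: the function name roots_mod_p is hypothetical, coefficients are listed from the constant term upward, and the test it supports (reducible exactly when a root exists) is valid only for polynomials of degree 2 or 3.

```python
def roots_mod_p(coeffs, p):
    """Roots in Z_p of a polynomial given by its coefficients, constant term first."""
    def value(c):
        acc = 0
        for a in reversed(coeffs):    # Horner's rule, reducing mod p at each step
            acc = (acc * c + a) % p
        return acc
    return [c for c in range(p) if value(c) == 0]

x2_plus_1 = [1, 0, 1]                 # x^2 + 1
for p in (2, 7, 13):
    print(p, roots_mod_p(x2_plus_1, p))
```

The search finds the root 1 in Z_2, no root in Z_7, and the roots 5 and 8 in Z_13, matching the three cases listed.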

3-4-17 / PROBLEMS

1. Find all units of the polynomial rings:

(i) Q[x], (ii) R[x], (iii) Z_2[x],

(iv) Z_3[x], (v) Z_7[x], (vi) Z_11[x],

(vii) Z_p[x], when p is a prime.

2. List all the associates of 2x^2 - x + 1 in

(i) Z_3[x], (ii) Z_5[x], (iii) Z_7[x], (iv) Z_11[x].

3. Consider the polynomials

f_1(x) = 3x^2 - 4x + 2, f_2(x) = 4x^2 + 3x + 1,
f_3(x) = 2x^2 - 2x + 4, f_4(x) = 4x^2 + 2x - 1.

Which of these are associates in:

(i) Z_5[x], (ii) Z_7[x], (iii) Z_11[x]?

4. Consider the polynomials f(x) = x^4, g(x) = x^6, h(x) = x^3 in Z_7[x], and
let f, g, h denote the associated polynomial functions in Map(Z_7, Z_7).
Is it true that f · g = h?

5. Which of the following is an equivalence relation on the given set?

(i) X is the set of all triangles in the plane; ρ means "has the same
perimeter as."

(ii) X is the set of all points in the plane; ρ means "is at distance less
than or equal to 2 from."

(iii) X is the set of all living human beings; ρ means "is the father of."

(iv) X is the set of all living human beings; ρ means "is a brother of."

(v) X is the set of all living human beings; ρ means "has an ancestor
in common with."

(vi) X = F[x] and for a fixed f(x) of deg ≥ 1, g(x) ρ h(x) means
"g(x) - h(x) is divisible by f(x) (where g(x) and h(x) are arbitrary
elements of F[x])."

(vii) X is the collection of all sets; ρ means "can be put in 1-1
correspondence with."

(viii) X = Z; m ρ n means "m - n is even."

(ix) X = Z; m ρ n means "m - n is odd."

For those that are not equivalence relations, list the requirements that are
not satisfied.

6. For each of the following, describe a set X and a relation ρ on it such that
ρ is:

(i) reflexive, but neither symmetric nor transitive,

(ii) symmetric, but neither reflexive nor transitive,

(iii) transitive, but neither reflexive nor symmetric,

(iv) reflexive and symmetric, but not transitive,

(v) reflexive and transitive, but not symmetric,

(vi) symmetric and transitive, but not reflexive.

7. Consider the polynomial f(x) = 3x^2 - 4x + 2. Find the unique monic
polynomial belonging to the equivalence class [f(x)]_~ when f(x) is viewed in

(i) Q[x], (ii) Z_5[x], (iii) Z_7[x], (iv) Z_11[x].

8. (a) Compute the division algorithm in Q[x] for each of the following
pairs of polynomials:

(i) f(x) = x - 3, g(x) = x^3 + x^2 - 13x + 3,

(ii) f(x) = x + 2, g(x) = 2x^5 + 4x^3 - 7x^2 + 5x + 3,

(iii) f(x) = 2x - 1, g(x) = x^4 - 3x^2 + x + 5,

(iv) f(x) = x^2 + 2, g(x) = 3x^7 - 4,

(v) f(x) = 3x^2 - 2, g(x) = 2x^6 - x + 5.

(b) Do the same thing when f(x) and g(x) are viewed in R[x].

9. Perform the division algorithm for each pair of polynomials in Problem
8 when they are viewed in

(i) Z_2[x], (ii) Z_3[x], (iii) Z_5[x], (iv) Z_7[x].

10. Compute the division algorithm for f(x) = 4x^3 - 2x^2 + 1 and g(x) =
4x^5 + 2x^4 + 2x^3 - 2x^2 + x - 2 when they are viewed in the polynomial
ring

(i) Q[x], (ii) R[x], (iii) Z_3[x],

(iv) Z_5[x], (v) Z_7[x].

11. Suppose R is a commutative ring with unity and f(x) ∈ R[x] is a
polynomial of degree greater than or equal to 1 whose leading coefficient is a
unit of R. Show that we have a division algorithm for f(x) and any
g(x) ∈ R[x]. Is it unique?

12. List as many places as you can where an equivalence relation has come 
up in this book. 

13. Suppose f(x), g(x) ∈ F[x] are both monic. Is it true that f(x) + g(x),
f(x) - g(x), and f(x) · g(x) are monic?

14. For each pair of polynomials in Q[x] find their gcd and express it as a
linear combination of them.

(i) f(x) = 2x^3 - 4x^2 + x - 2, g(x) = x^3 - x^2 - x - 2,

(ii) f(x) = x^4 + x^3 + x^2 + x + 1, g(x) = x^3 - 1,

(iii) f(x) = x^2 + x + 1, g(x) = x^4 + x^2 + 1,

(iv) f(x) = x^3 - 1, g(x) = x^5 - x^4 + x^3 - x^2 + x - 1.

15. For each pair of polynomials in the appropriate polynomial ring F[x],
find the gcd and express it as a linear combination of them.

(i) f(x) = x^2 + x + 1, g(x) = x^4 + x + 1 in Z_2[x],

(ii) f(x) = x^2 + 1, g(x) = x^5 + 1 in Z_2[x],

(iii) f(x) = x^2 - x + 4, g(x) = x^3 + 2x^2 + 3x + 2 in Z_3[x],

(iv) f(x) = x^2 + 4, g(x) = x^3 + 2x^2 + 3x + 2 in Z_5[x].

16. Exhibit two polynomials of degree 3 in R[x] such that the last nonzero 
remainder of their Euclidean algorithm is x — 2. 

17. Suppose F is a field and K is a field which contains F (for example,
F = Q and K = R). Consider polynomials f(x), g(x) ∈ F[x]. Show that
their gcd when they are viewed as polynomials in K[x] is the same as
their gcd when they are viewed as polynomials in F[x].

18. Suppose that for nonzero polynomials f(x), g(x) ∈ F[x] we define their
greatest common divisor to be the monic polynomial of biggest degree
which is a common divisor of f(x) and g(x). Show that this definition of
gcd is equivalent to the one we have given in 3-4-9.

19. Show that our results about gcd are also valid when f(x) | g(x) or when
one of the polynomials f(x), g(x) is 0. Things here go as indicated in
1-3-11 and 1-3-12.

20. (i) Suppose p(x), f(x) ∈ F[x] are both of degree greater than or equal to 1.
If p(x) is prime and p(x) ∤ f(x), show that (p(x), f(x)) = 1.

(ii) If p(x) and q(x) are both monic and prime with p(x) | q(x), then
p(x) = q(x).

21. Given f(x), g(x), h(x) ∈ F[x], we have:

(i) (f(x), g(x)) = 1 if and only if there exist s(x), t(x) ∈ F[x] for which
s(x)f(x) + t(x)g(x) = 1.

(ii) If f(x) | h(x), g(x) | h(x) and (f(x), g(x)) = 1, then f(x)g(x) | h(x).

(iii) If (f(x), g(x)) = 1 and (f(x), h(x)) = 1, then (f(x), g(x)h(x)) = 1.

(iv) If (f(x), g(x)) = 1 and h(x) | f(x), then (h(x), g(x)) = 1.

(v) If (f(x), g(x)h(x)) = 1, then (f(x), g(x)) = (f(x), h(x)) = 1.

(vi) If (f(x), g(x)) = 1, then (f(x)h(x), g(x)) = (h(x), g(x)).

22. How should one account for the fact that in Z_15[x], x^2 - 1 =
(x - 1)(x - 14) = (x - 4)(x - 11) has two distinct factorizations into
primes?

23. Decide if the polynomial x^2 + x + 1 is irreducible over each of the
following fields; if it is reducible, factor into irreducible polynomials.

(i) Q, (ii) R, (iii) C, (iv) Z_2,

(v) Z_3, (vi) Z_5, (vii) Z_7, (viii) Z_11.

24. Do Problem 23 for each of the following polynomials:

(i) x^2 + 2, (ii) x^2 - 2, (iii) x^3 - 1,

(iv) x^3 + 1, (v) x^2 + 3x - 2, (vi) x^3 - 2.

25. Suppose f(x) is a monic polynomial of degree 2 or 3 over the field F.
Show that f(x) is irreducible over F <=> f(x) has no roots in F. Why does
this statement not carry over when deg f(x) = 4?

26. Consider the polynomial f(x) = x^7 - x ∈ Z_7[x]. Find all its roots. Can
you find its factorization into primes?

27. Consider the polynomial f(x) = x^2 + x + 1 ∈ Z_5[x] and compute g(x) =
(f(x))^5 = (x^2 + x + 1)^5. Compute the associated polynomial functions
f, g ∈ Map(Z_5, Z_5) (that is, specify their values for each element of
Z_5) and show that f = g.

28. Suppose R and S are commutative rings with unity, and φ: R → S is a
homomorphism. Define a mapping

φ#: R[x] → S[x]

as follows: If f(x) = Σ a_i x^i ∈ R[x], then φ#(f(x)) [which may be written
as f#(x)] equals Σ φ(a_i) x^i. Show that φ# is a homomorphism; if φ is
injective, so is φ#; if φ is surjective, so is φ#; if φ is an isomorphism, so
is φ#.

29. For any m ≥ 2, let φ: Z → Z_m be the canonical map, which sends each
integer a to its residue class modulo m; so (with the notation of Problem 28)
φ#: Z[x] → Z_m[x] is a surjective homomorphism. Prove that ker φ# consists
of those polynomials all of whose coefficients are divisible by m.

30. By making use of the mapping φ#: Z[x] → Z_p[x] prove that if f(x),
g(x), h(x) ∈ Z[x] with f(x) = g(x)h(x) and every coefficient of f(x) is
divisible by the prime p, then at least one of g(x) or h(x) has the property
that all its coefficients are divisible by p.

31. (i) Does f(x) = x^2 + 3 divide g(x) = x^5 + x^3 + x^2 - 9 in Z[x]?

(ii) Find all m ≥ 2 such that f(x) | g(x) in Z_m[x].

(iii) Do the same for f(x) = x^2 + 2, g(x) = x^6 + 5x^5 + 5x^4 + 10x^3
+ 8x^2 + 4.

32. For each monic irreducible polynomial p(x) ∈ F[x] define ν_{p(x)}(f(x)) for
any f(x) ≠ 0 ∈ F[x] as the exponent to which p(x) appears in the prime
factorization of f(x). State and prove results concerning the ν_{p(x)}'s
analogous to those discussed in Section 1-5 for the ν_p's.

33. Define "least common multiple" in F[x]. Discuss its properties, and
prove them.

3-5. Roots of Polynomials 

Throughout this section, F will denote an arbitrary field, D an integral 
domain, and R a commutative ring with unity. To keep the discussion 
consistent, our general results, which deal mainly with roots of polynomials,
will be stated in F[x]; but many of them are valid in D[x], and some even hold
in R[x]. These results are generalizations of facts that are more or less familiar
from high school—where instead of working over a field F, we invariably 
considered only polynomials with coefficients coming from the fields Q, R, or 
C. From these general considerations, we shall pass to some remarks for 
specific choices of F—namely, Q, R, C, and (because of its number-theoretic 
importance) Z p . 

3-5-1. Remainder Theorem. Consider an arbitrary polynomial f(x) ∈ F[x];
then for any c ∈ F the remainder upon division of f(x) by (x - c) is
precisely f(c).

Proof: According to the division algorithm, when (x - c) is divided into
f(x) we obtain, uniquely, a quotient q(x) and a remainder r(x) with deg r(x) <
deg(x - c) = 1. Thus, r(x) is a constant, r(x) = r ∈ F, and we have

f(x) = q(x)(x - c) + r.

Now, substitution gives

f(c) = q(c)(c - c) + r = r

which completes the proof. |

The remainder theorem may be rephrased as asserting the existence of a
unique q(x) ∈ F[x] such that

f(x) = q(x)(x - c) + f(c).

So as a corollary we have: (x - c) | f(x) <=> there exists q(x) ∈ F[x] for which
f(x) = q(x)(x - c) <=> f(c) = 0 <=> c is a root of f(x). This proves:

3-5-2. Factor Theorem. For c ∈ F and f(x) ∈ F[x] we have
(x - c) divides f(x) <=> c is a root of f(x).
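
Both theorems can be observed numerically by dividing by (x - c) with synthetic division, which is Horner's rule with the intermediate values kept. This is a sketch under assumptions: the function name divide_by_linear is hypothetical, coefficients are listed from the constant term upward, and exact rational arithmetic stands in for the field F = Q. The polynomial used is f(x) = 12x^3 + 16x^2 - 3x from 3-4-8.

```python
from fractions import Fraction

def divide_by_linear(coeffs, c):
    """Divide f(x) by (x - c); return (quotient coefficients, remainder).

    Coefficient lists run from the constant term upward.
    """
    acc = Fraction(0)
    row = []
    for a in reversed(coeffs):        # start from the leading coefficient
        acc = acc * c + a
        row.append(acc)
    remainder = row.pop()             # the final value is f(c)
    return row[::-1], remainder

f = [0, -3, 16, 12]                   # 12x^3 + 16x^2 - 3x

q, r = divide_by_linear(f, 1)
print(q, r)                           # remainder equals f(1) = 12 + 16 - 3

q, r = divide_by_linear(f, Fraction(-3, 2))
print(q, r)                           # remainder 0, so (x + 3/2) divides f(x)
```

The second call reflects the factor theorem: x + 3/2 is the gcd found in 3-4-11(4), so -3/2 is a root of f(x) and the remainder vanishes.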

3-5-3. Remark. A stronger form of the factor theorem has already been
proved. In fact, in 3-3-13, we proved it for f(x) ∈ R[x], c ∈ R, by another
method. It is worth noting that the method of proof used here in 3-5-1 and
3-5-2 can be carried over to R[x]. In more detail, even though there may not
be a division algorithm for every pair of polynomials in R[x], we can derive a
division algorithm for (x - c) ∈ R[x] and f(x) ∈ R[x] precisely because (x - c)
is monic. In other words, given c ∈ R and f(x) ∈ R[x] there exist unique
q(x) ∈ R[x] and r ∈ R such that

f(x) = q(x)(x - c) + r (*)

and from this, the remainder theorem and the factor theorem follow as above. 
Incidentally, the reader can easily prove (*) informally by actually doing the 
long division; because (x — c) is monic there is nothing to stop the process 
until one arrives at a remainder with no x’s—that is, a constant. Or else, one 
can prove (*) formally by rewriting the proof of 3-4-6 (including uniqueness) 
practically word for word; everything goes through because the leading 
coefficient of (x — c) is 1. 

3-5-4. Theorem. Suppose f(x) = a_0 + a_1 x + ··· + a_n x^n ∈ F[x] is a
polynomial of degree n ≥ 1, and suppose we have n distinct elements
c_1, c_2, ..., c_n of F, all of which are roots of f(x); then f(x) has the
factorization

f(x) = a_n(x - c_1)(x - c_2) ··· (x - c_n).


Proof: The idea of the proof is simple. Since c_1 is a root of f(x) there
exists f_1(x) ∈ F[x] for which

f(x) = (x - c_1)f_1(x).

Then c_2, ..., c_n are roots of f_1(x), because they are distinct from c_1. In
particular, we can write f_1(x) = (x - c_2)f_2(x), so that

f(x) = (x - c_1)(x - c_2)f_2(x), f_2(x) ∈ F[x].

This process continues until all the c_i's are exhausted, at which stage the
expression looks like

f(x) = (x - c_1)(x - c_2) ··· (x - c_n)f_n(x), f_n(x) ∈ F[x].

But now, by comparing degrees and leading coefficients, it follows that
f_n(x) = a_n.

Rather than being content with the preceding sketch of a proof, let us give
a rigorous, formal proof by induction on n.

If n = 1, then our polynomial is of form f(x) = a_0 + a_1 x, a_1 ≠ 0, and we
have a root c_1 of f(x). Consequently, 0 = f(c_1) = a_0 + a_1 c_1, so that c_1 =
-a_1^{-1} a_0 and

f(x) = a_1(x + a_1^{-1} a_0) = a_1(x - c_1).

Thus, the theorem holds for n = 1.

Now suppose inductively that the theorem holds for n (that is, for any
polynomial of degree n and n of its roots) and suppose we have n + 1 distinct
roots c_1, ..., c_{n+1} (in F) of a polynomial f(x) = a_0 + a_1 x + ··· + a_n x^n
+ a_{n+1} x^{n+1} ∈ F[x] of degree n + 1. Since c_{n+1} is a root, there exists
f_1(x) ∈ F[x] such that

f(x) = f_1(x)(x - c_{n+1}). (#)

For each i = 1, 2, ..., n we then have

0 = f(c_i) = f_1(c_i)(c_i - c_{n+1}).

But c_i - c_{n+1} ≠ 0 for i = 1, ..., n because c_1, c_2, ..., c_n, c_{n+1} are distinct;
so that f_1(c_i) = 0. Consequently, we have n distinct roots c_1, c_2, ..., c_n of
f_1(x), and from (#), f_1(x) is clearly a polynomial of degree n whose leading
coefficient is a_{n+1}. By the induction hypothesis, we know that

f_1(x) = a_{n+1}(x - c_1)(x - c_2) ··· (x - c_n)

and hence

f(x) = a_{n+1}(x - c_1)(x - c_2) ··· (x - c_n)(x - c_{n+1}).

So the theorem holds for n + 1. This completes the proof. |
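
A small numerical check of 3-5-4 may help fix ideas. The polynomial 2x^3 + 5 in Z_7[x] is a hypothetical example chosen here (it is not taken from the text) because it has three distinct roots in Z_7; the function names evaluate and expand_from_roots are likewise illustrative, and coefficients are listed from the constant term upward.

```python
def evaluate(f, c, p):
    """Value of f at c in Z_p, by Horner's rule."""
    acc = 0
    for a in reversed(f):
        acc = (acc * c + a) % p
    return acc

def expand_from_roots(lead, roots, p):
    """Coefficients of lead * (x - c_1) ... (x - c_n) in Z_p[x], constant term first."""
    out = [lead % p]
    for c in roots:
        shifted = [0] + out                           # x * out
        scaled = [(-c * a) % p for a in out] + [0]    # -c * out
        out = [(u + v) % p for u, v in zip(shifted, scaled)]
    return out

p = 7
f = [5, 0, 0, 2]                                      # 2x^3 + 5
roots = [c for c in range(p) if evaluate(f, c, p) == 0]
print(roots)
print(expand_from_roots(f[-1], roots, p) == f)
```

Since the number of distinct roots found equals the degree, the theorem predicts that the expanded product a_n(x - c_1)(x - c_2)(x - c_3) reproduces f(x), and the comparison in the last line confirms it.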

This result has diverse and important consequences of which the most 
familiar is the following. 

3-5-5. Corollary. A polynomial of degree n ≥ 1 over the field F can have
at most n distinct roots in F.

Proof: By contradiction. Suppose f(x) = a_0 + a_1 x + ··· + a_n x^n ∈ F[x] is a
polynomial of degree n which has more than n distinct roots. In particular,
f(x) surely has n + 1 distinct roots in F; call them c_1, ..., c_n, c_{n+1}. According
to the theorem applied to f(x) and the n distinct roots c_1, c_2, ..., c_n [observe
that the hypothesis of 3-5-4 does not require c_1, ..., c_n to be all the roots of
f(x)], f(x) has the factorization

f(x) = a_n(x - c_1)(x - c_2) ··· (x - c_n).

But then, since c_{n+1} is a root,

0 = f(c_{n+1}) = a_n(c_{n+1} - c_1)(c_{n+1} - c_2) ··· (c_{n+1} - c_n).

Because we are in a field, at least one of the factors on the right-hand side
must be 0; however, by our hypotheses, a_n ≠ 0 and c_1, c_2, ..., c_n, c_{n+1} are
distinct, a contradiction. |


3-5-6. Corollary. If n ≥ 0 and f(x) ∈ F[x] is a polynomial of degree less
than or equal to n which has n + 1 (or more) distinct roots in F, then f(x) =
0; that is, f(x) is the zero polynomial.


Proof: Suppose deg f(x) = k ≤ n. If k = 0, then, according to the
hypothesis, f(x) has at least one root in F; but this is impossible because a
polynomial of degree 0 is a nonzero constant, and thus has no roots. If k ≥ 1, then,
by hypothesis, f(x) has n + 1 ≥ k + 1 distinct roots in F, which contradicts
3-5-5. The only possibility remaining is k = deg f(x) = -∞; in other words,
f(x) = 0. |


3-5-7. Corollary. Suppose n ≥ 0 and f(x), g(x) ∈ F[x] are both of degree
less than or equal to n. If their values agree at n + 1 (or more) elements of
F, then f(x) = g(x).


Proof: The essential assertion here is: If c_1, c_2, ..., c_{n+1} are distinct
elements of F, and

f(c_i) = g(c_i), i = 1, 2, ..., n + 1,

then f(x) = g(x). But this is easy. In fact, φ(x) = f(x) - g(x) is a polynomial
of degree less than or equal to n which has n + 1 distinct roots in F, since for
i = 1, ..., n + 1

φ(c_i) = f(c_i) - g(c_i) = 0.

According to 3-5-6, φ(x) = 0; so f(x) = g(x). |
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3-5-8. Corollary. Suppose the field F has an infinite number of elements. 
Then distinct polynomials over F give rise to distinct polynomial func¬ 
tions; in other words, the homomorphism of rings 

A:F[x]->Map(F, F) 

treated in 3-3-11 is injective. 


Proof: From 3-3-11, we know that $A$ is a homomorphism; so to prove $A$
is injective, it suffices to show $\ker A = (0)$.

Suppose $f(x) \in \ker A$. Then its image $\bar f$ under $A$ is 0; that is,
$\bar f = A(f(x)) = 0 \in \operatorname{Map}(F, F)$. Now an element of
$\operatorname{Map}(F, F)$ equals 0 when it is the mapping which takes the
value 0 for every element of $F$; hence, $\bar f(c) = 0$ for every $c \in F$.
According to the definition of the polynomial function $\bar f$, we have

$$f(c) = \bar f(c) = 0 \qquad \text{for every } c \in F.$$

Thus, every element of $F$ is a root of $f(x)$ and (this is the crucial step)
because $F$ is infinite, $f(x)$ has an infinite number of distinct roots in $F$.

However, for the given polynomial $f(x) \in \ker A$, there surely exists an
integer $n \ge 0$ for which $\deg f(x) \le n$. Since $f(x)$ has more than
$n + 1$ distinct roots, 3-5-6 guarantees that $f(x) = 0$. ∎

3-5-9. Remarks. (1) It has already been observed that 3-5-1 and 3-5-2
hold when $F$ is replaced by $D$, or even by $R$. Here we note that all the
subsequent results (namely, 3-5-4 to 3-5-8) are also valid over any integral
domain $D$ and, in particular, over $\mathbf{Z}$. It is a straightforward matter
for the reader to check that in this situation our proofs of these results go
through, word for word.

However, these results do not carry over for general $R$. For example, over
$R = \mathbf{Z}_{15}$ (which is not an integral domain) the polynomial
$x^2 - 1$ is of degree 2, but has the four roots 1, 4, 11, 14 in
$\mathbf{Z}_{15}$; so, in particular, 3-5-5 is not valid over $\mathbf{Z}_{15}$.
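
The count of roots over $\mathbf{Z}_{15}$ is easy to confirm by brute force. The sketch below is only an illustration; the helper name `roots_mod` and the convention of listing coefficients from the constant term upward are assumptions made here, not notation from the text.

```python
def roots_mod(coeffs, m):
    """Return all c in {0, ..., m-1} with f(c) congruent to 0 mod m.

    coeffs lists the coefficients from the constant term upward,
    so [-1, 0, 1] stands for x^2 - 1.
    """
    def value(c):
        # Horner's rule, reducing mod m at every step.
        acc = 0
        for a in reversed(coeffs):
            acc = (acc * c + a) % m
        return acc

    return [c for c in range(m) if value(c) == 0]


print(roots_mod([-1, 0, 1], 15))  # x^2 - 1 over Z_15 -> [1, 4, 11, 14]
print(roots_mod([-1, 0, 1], 7))   # x^2 - 1 over the field Z_7 -> [1, 6]
```

Over the field $\mathbf{Z}_7$ the same polynomial has only two roots, as 3-5-5 requires.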

(2) In high school, the polynomials considered always came from
$\mathbf{Z}[x]$, $\mathbf{Q}[x]$, $\mathbf{R}[x]$, or $\mathbf{C}[x]$. Because
$\mathbf{Z}$, $\mathbf{Q}$, $\mathbf{R}$, $\mathbf{C}$ all have an infinite
number of elements, 3-5-8 tells us that in these cases distinct polynomials
give rise to distinct polynomial functions. This fact justifies the customary
carelessness in high school, where the distinction between polynomials and
polynomial functions is glossed over. It is imprecise to do so, but it does not
lead to seriously wrong results.

(3) When $F$ has only a finite number of elements, the proof of 3-5-8
breaks down, and, in fact, the result is false (more on this in 3-5-15). For
example, we have already seen in 3-3-10 that when $F = \mathbf{Z}_p$, the
distinct polynomials $f(x) = x^p$ and $g(x) = x$ give rise to the same
polynomial function; that is, $\bar f = \bar g$, or put another way,
$A(f(x)) = A(g(x))$. Of course, the polynomial

$$\varphi(x) = x^p - x \in \mathbf{Z}_p[x]$$

then satisfies

$$\varphi(a) = 0 \qquad \text{for every } a \in \mathbf{Z}_p,$$

so the polynomial function associated with $\varphi(x)$ is
$A(\varphi(x)) = \bar\varphi = 0$. Thus, $\varphi(x)$ is a nonzero polynomial in
$\ker A$, so $A$ is not injective.
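
As a quick sanity check of this remark, one can compare the values of $x^p$ and $x$ at every element of $\mathbf{Z}_p$. The function name `same_function_mod_p` below is a hypothetical helper chosen for this sketch; it relies only on Python's built-in three-argument `pow`.

```python
def same_function_mod_p(p):
    """Check that c -> c^p and c -> c agree at every c in Z_p."""
    return all(pow(c, p, p) == c % p for c in range(p))


for p in (2, 3, 5, 7, 11):
    print(p, same_function_mod_p(p))

# For a composite modulus the two functions can differ, e.g. 2^4 = 0 in Z_4.
print(4, same_function_mod_p(4))
```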


Let us digress a bit from our main theme of studying roots of polynomials
to discuss an interesting problem in which the previous results have an
application. We work in the Euclidean plane $\mathbf{R} \times \mathbf{R}$ and
consider a “curve-fitting” problem. Given a finite set of points in the plane,
we wish to find a curve (if one exists) that passes through all of them;
moreover, we want the curve to be the graph of some polynomial.

This problem is not unfamiliar. For example, suppose we are given two
points (2, 3) and (5, 7) in the plane. Of course, there is a unique straight line
which passes through these points. In fact, one sees easily that this straight
line is the one whose equation is

$$y = \tfrac{4}{3}x + \tfrac{1}{3}.$$

In other words, if we let $f(x)$ denote the polynomial
$\tfrac{4}{3}x + \tfrac{1}{3}$ of degree 1 in $\mathbf{R}[x]$, then its graph
$G(f) = \{(a, f(a)) \mid a \in \mathbf{R}\}$ (as introduced in 3-3-12) is the
unique straight line that passes through the points (2, 3) and (5, 7).

More generally, if we are given the points $(a_1, b_1)$, $(a_2, b_2)$, with
$a_1 \ne a_2$, in the plane, then the reader may check that the straight line
passing through these points has the equation

$$y = \frac{b_2 - b_1}{a_2 - a_1}\,x + \frac{a_2 b_1 - a_1 b_2}{a_2 - a_1}.$$

In other words, the graph of the linear polynomial

$$f(x) = \frac{b_2 - b_1}{a_2 - a_1}\,x + \frac{a_2 b_1 - a_1 b_2}{a_2 - a_1} \in \mathbf{R}[x]$$

is the straight line passing through the points $(a_1, b_1)$ and $(a_2, b_2)$.
Note the need for $a_1 \ne a_2$; if $a_1 = a_2$, then the straight line in
question is vertical and, in particular, it is impossible to find a polynomial
$f(x) \in \mathbf{R}[x]$ whose graph passes through both points. After all, the
graph of a polynomial includes exactly one point for each value of $x$.

What happens if we are given three points, say $(-1, 6)$, $(0, 1)$, $(2, 3)$?
Here, one might guess that there is a “quadratic equation” of form

$$y = a + bx + cx^2$$

whose graph passes through the three given points. If so, we may substitute
the coordinates of the points in this equation, to obtain

$$\begin{aligned}
6 &= a - b + c, \\
1 &= a, \\
3 &= a + 2b + 4c.
\end{aligned}$$

Thus, $a = 1$, $b = -3$, $c = 2$; so the graph of the polynomial

$$f(x) = 1 - 3x + 2x^2 = 2x^2 - 3x + 1$$

passes through $(-1, 6)$, $(0, 1)$, and $(2, 3)$.

In general, if three points in the plane are given, we would substitute their
coordinates in the equation $y = a + bx + cx^2$, thus obtaining three linear
equations in the three unknowns $a$, $b$, $c$. Presumably these three equations
can be solved (and we finally have the equation of a parabola; or a straight
line when $c = 0$, which occurs when the three points are collinear), but how
do we know?

Similarly, when four points in the plane are given, we can substitute their
coordinates in the equation

$$y = f(x) = a + bx + cx^2 + dx^3,$$

thus getting four linear equations in the four unknowns $a$, $b$, $c$, $d$.
Hopefully, these equations have a solution, and we finally have a cubic
polynomial whose graph passes through the four given points.

Surely, the same method (known as the method of undetermined coefficients)
can be extended to five, six, or more points; but it becomes increasingly
harder to solve the simultaneous linear equations as the number of unknowns
increases, and we cannot be certain that a solution of these equations exists,
or whether a solution is unique. Fortunately, there is an elegant method, due
to Lagrange (1736-1813), that settles explicitly all questions related to the
problem of finding a polynomial whose graph passes through a finite number
of given points.
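
For comparison with what follows, here is a minimal sketch of the method of undetermined coefficients carried out mechanically. It assumes the $x$-coordinates are distinct and uses exact rational arithmetic with plain Gauss-Jordan elimination; the function name `fit_by_undetermined_coefficients` is hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def fit_by_undetermined_coefficients(points):
    """Solve for a_0, ..., a_{n-1} with sum(a_k * x**k) == y at each point.

    Plain Gauss-Jordan elimination over the rationals; the x-coordinates
    are assumed to be distinct.
    """
    n = len(points)
    # Augmented matrix: row i is [1, x_i, x_i^2, ..., x_i^(n-1) | y_i].
    rows = [[Fraction(x) ** k for k in range(n)] + [Fraction(y)]
            for x, y in points]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so the pivot entry becomes 1.
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]


# The three points of the text: coefficients a, b, c of a + b x + c x^2.
print(fit_by_undetermined_coefficients([(-1, 6), (0, 1), (2, 3)]))
```

For the three points used above this returns the coefficients 1, -3, 2, in agreement with $f(x) = 1 - 3x + 2x^2$.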

3-5-10. Lagrange Interpolation. Suppose $n \ge 1$ and $a_1, a_2, \ldots, a_n$
are distinct elements of the field $F$. Then, given any elements
$b_1, b_2, \ldots, b_n$ of $F$, there exists a unique polynomial
$f(x) \in F[x]$, of degree less than $n$, such that

$$f(a_i) = b_i, \qquad i = 1, 2, \ldots, n.$$

To put it another way, we can exhibit explicitly a unique polynomial $f(x)$
of degree less than $n$ whose graph (in $F \times F$) passes through the $n$
points $(a_1, b_1), (a_2, b_2), \ldots, (a_n, b_n)$ of $F \times F$.


Proof: For each $j = 1, 2, \ldots, n$ let us put

$$g_j(x) = \prod_{i \ne j} (x - a_i).$$

Thus, each $g_j(x)$ is the product of linear terms $(x - a_i) \in F[x]$, one for
each $i \ne j$; so clearly $g_j(x)$ is a polynomial of degree $n - 1$ in $F[x]$.
Since $a_1, a_2, \ldots, a_n$ are distinct, we observe that for each $j$,

$$g_j(a_j) = \prod_{i \ne j} (a_j - a_i) \ne 0.$$

On the other hand, when $i \ne j$, one of the terms in the product used to
compute $g_j(a_i)$ is $(a_i - a_i)$, so

$$g_j(a_i) = 0, \qquad \text{for } i \ne j.$$

Now, for each $j = 1, 2, \ldots, n$ we put

$$f_j(x) = \frac{1}{g_j(a_j)}\, g_j(x).$$

All we have done is to multiply the polynomial $g_j(x)$ by the nonzero field
element $1/g_j(a_j)$ [and this is valid because $g_j(a_j) \ne 0$]. Thus, each
$f_j(x)$ is a polynomial of degree $n - 1$ in $F[x]$, and the values taken by
$f_j(x)$ at $a_1, a_2, \ldots, a_n$ are

$$f_j(a_j) = 1, \qquad f_j(a_i) = 0 \quad \text{for } i \ne j.$$

Then,

$$f(x) = b_1 f_1(x) + b_2 f_2(x) + \cdots + b_n f_n(x)$$

[note the analogy of this construction of $f(x)$ with the way we proved the
Chinese remainder theorem 3-1-8] is a polynomial of degree less than or
equal to $n - 1$ in $F[x]$, and it satisfies

$$f(a_i) = b_i, \qquad i = 1, 2, \ldots, n.$$
As for uniqueness, if $f'(x) \in F[x]$ is also a polynomial of degree less than
or equal to $n - 1$ such that $f'(a_i) = b_i$ for $i = 1, \ldots, n$, then we
have two polynomials $f(x)$, $f'(x)$ of degree less than or equal to $n - 1$
whose values agree at $n$ elements of $F$; so, according to 3-5-7,
$f(x) = f'(x)$. ∎


3-5-11. Example. Consider the five points

$$(-1, 10), \quad (1, 2), \quad (2, 1), \quad (3, 18), \quad (0, 3)$$

in the Euclidean plane, $\mathbf{R} \times \mathbf{R}$. Let us find the unique
polynomial of degree less than or equal to 4 (in $\mathbf{R}[x]$) that passes
through these five points. This will be done by the method used in the proof of
Lagrange interpolation.

We have, in the notation of 3-5-10, $n = 5$ and

$$a_1 = -1, \quad a_2 = 1, \quad a_3 = 2, \quad a_4 = 3, \quad a_5 = 0,$$

$$b_1 = 10, \quad b_2 = 2, \quad b_3 = 1, \quad b_4 = 18, \quad b_5 = 3.$$

Therefore,

$$g_1(x) = (x - 1)(x - 2)(x - 3)(x - 0)$$

[since $g_1(x)$ is, by definition, $(x - a_2)(x - a_3)(x - a_4)(x - a_5)$] and

$$\begin{aligned}
g_2(x) &= (x + 1)(x - 2)(x - 3)(x - 0), \\
g_3(x) &= (x + 1)(x - 1)(x - 3)(x - 0), \\
g_4(x) &= (x + 1)(x - 1)(x - 2)(x - 0), \\
g_5(x) &= (x + 1)(x - 1)(x - 2)(x - 3).
\end{aligned}$$

For each $j = 1, \ldots, 5$ we compute $g_j(a_j)$, thus obtaining

$$\begin{aligned}
g_1(-1) &= (-2)(-3)(-4)(-1) = 24, \\
g_2(1) &= (2)(-1)(-2)(1) = 4, \\
g_3(2) &= (3)(1)(-1)(2) = -6, \\
g_4(3) &= (4)(2)(1)(3) = 24, \\
g_5(0) &= (1)(-1)(-2)(-3) = -6.
\end{aligned}$$

Next, we compute

$$f_j(x) = \frac{1}{g_j(a_j)} \cdot g_j(x) \qquad \text{for } j = 1, \ldots, 5;$$

upon multiplying out the terms in $g_j(x)$, these turn out to be

$$\begin{aligned}
f_1(x) &= \tfrac{1}{24}(x^4 - 6x^3 + 11x^2 - 6x), \\
f_2(x) &= \tfrac{1}{4}(x^4 - 4x^3 + x^2 + 6x), \\
f_3(x) &= -\tfrac{1}{6}(x^4 - 3x^3 - x^2 + 3x), \\
f_4(x) &= \tfrac{1}{24}(x^4 - 2x^3 - x^2 + 2x), \\
f_5(x) &= -\tfrac{1}{6}(x^4 - 5x^3 + 5x^2 + 5x - 6).
\end{aligned}$$

If we now put

$$f(x) = b_1 f_1(x) + b_2 f_2(x) + b_3 f_3(x) + b_4 f_4(x) + b_5 f_5(x),$$

then a straightforward computation gives

$$\begin{aligned}
f(x) &= 10 f_1(x) + 2 f_2(x) + 1 f_3(x) + 18 f_4(x) + 3 f_5(x) \\
     &= x^4 - 3x^3 + 2x^2 - x + 3.
\end{aligned}$$

This is the desired polynomial satisfying

$$f(-1) = 10, \quad f(1) = 2, \quad f(2) = 1, \quad f(3) = 18, \quad f(0) = 3.$$
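
The construction in the proof of 3-5-10 translates almost line by line into a short program. The following is a sketch under stated assumptions: polynomials are represented as coefficient lists with the constant term first, arithmetic is done exactly with `fractions.Fraction`, and the names `poly_mul` and `lagrange` are hypothetical helpers, not part of the text.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two coefficient lists (constant term first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def lagrange(points):
    """Coefficients of the interpolating polynomial built as in 3-5-10."""
    n = len(points)
    result = [Fraction(0)] * n
    for j, (aj, bj) in enumerate(points):
        g = [Fraction(1)]      # g_j(x): product of (x - a_i) over i != j
        g_at_aj = Fraction(1)  # g_j(a_j): product of (a_j - a_i) over i != j
        for i, (ai, _) in enumerate(points):
            if i != j:
                g = poly_mul(g, [Fraction(-ai), Fraction(1)])
                g_at_aj *= Fraction(aj - ai)
        # Add b_j * f_j(x), where f_j(x) = g_j(x) / g_j(a_j).
        for k, coeff in enumerate(g):
            result[k] += Fraction(bj) * coeff / g_at_aj
    return result


points = [(-1, 10), (1, 2), (2, 1), (3, 18), (0, 3)]
print([str(c) for c in lagrange(points)])
```

Applied to the five points of 3-5-11, the coefficients come out as 3, -1, 2, -3, 1 (constant term first), that is, $x^4 - 3x^3 + 2x^2 - x + 3$.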

Returning to the central question of roots of polynomials, let us apply the
previous results to polynomials over a finite field $\mathbf{Z}_p$.


3-5-12. Proposition. If $p$ is prime and we denote the elements of the finite
field $\mathbf{Z}_p$ by $0, 1, 2, \ldots, p - 1$, then we have in
$\mathbf{Z}_p[x]$ the factorizations

$$\begin{aligned}
x^p - x &= x(x - 1)(x - 2) \cdots (x - (p - 1)), \\
x^{p-1} - 1 &= (x - 1)(x - 2) \cdots (x - (p - 1)).
\end{aligned}$$

Moreover, for any $a \in \mathbf{Z}_p$

$$a^p = a$$

and for any $a \ne 0$ in $\mathbf{Z}_p$

$$a^{p-1} = 1.$$


Proof: Every element $a \in \mathbf{Z}_p = \{0, 1, \ldots, p - 1\}$ is a root of
the polynomial $f(x) = x^p - x$ (since $a^p = a$, as seen in 3-3-10); hence,
$f(x) = x^p - x \in \mathbf{Z}_p[x]$ is a monic polynomial of degree $p > 1$ and
the $p$ distinct elements $0, 1, 2, \ldots, p - 1$ of $\mathbf{Z}_p$ are roots.
So, according to 3-5-4, we have the factorization

$$x^p - x = x(x - 1)(x - 2) \cdots (x - (p - 1)) \qquad \text{in } \mathbf{Z}_p[x].$$

Furthermore, $x^p - x = x(x^{p-1} - 1)$, and because $\mathbf{Z}_p[x]$ is an
integral domain $x$ may be canceled, thus leaving us with

$$x^{p-1} - 1 = (x - 1)(x - 2) \cdots (x - (p - 1)) \qquad \text{in } \mathbf{Z}_p[x]. \; ∎$$
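
For small primes the first factorization can be checked by multiplying out the linear factors with coefficients reduced mod $p$. This is a sketch only; the coefficient-list representation (constant term first) and the function names are assumptions introduced for the illustration.

```python
def product_of_linear_factors(p):
    """Coefficients (constant term first) of x(x-1)...(x-(p-1)) mod p."""
    poly = [1]
    for c in range(p):
        # Multiply poly by (x - c): x * poly shifts up, -c * poly scales.
        shifted = [0] + poly
        scaled = [-c * a for a in poly] + [0]
        poly = [(s + t) % p for s, t in zip(shifted, scaled)]
    return poly


def x_to_the_p_minus_x(p):
    """Coefficients (constant term first) of x^p - x mod p."""
    coeffs = [0] * (p + 1)
    coeffs[1] = (-1) % p
    coeffs[p] = 1
    return coeffs


for p in (2, 3, 5, 7, 11):
    print(p, product_of_linear_factors(p) == x_to_the_p_minus_x(p))
```

With the composite modulus 4 in place of $p$ the two coefficient lists differ, which is consistent with the proposition requiring $p$ to be prime.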

This result has a very useful number-theoretic consequence, which is
usually ascribed to Wilson (1741-1793), although it is not clear who really
discovered it.


3-5-13. Wilson’s Theorem. If $p$ is prime, then

$$(p - 1)! \equiv -1 \pmod{p}.$$

Even more, for any $n > 1$

$$(n - 1)! \equiv -1 \pmod{n} \iff n \text{ is prime}.$$

Proof: Look at the factorization

$$x^{p-1} - 1 = (x - 1)(x - 2) \cdots (x - (p - 1))$$

in $\mathbf{Z}_p[x]$. In particular, the constant terms of the two sides of this
equation must be equal. Hence,

$$-1 = (-1)(-2) \cdots (-(p - 1)).$$

This equality is in $\mathbf{Z}_p$, so for ordinary integers it translates to

$$-1 \equiv (-1)(-2) \cdots (-(p - 1)) \pmod{p}.$$

The right-hand side has $p - 1$ terms, so we have

$$-1 \equiv (-1)^{p-1} (p - 1)! \pmod{p}.$$

If $p$ is an odd prime, then $(-1)^{p-1} = 1$ and we obtain
$(p - 1)! \equiv -1 \pmod{p}$. If $p = 2$, then surely
$(p - 1)! \equiv -1 \pmod{p}$, since $1 \equiv -1 \pmod{2}$. This proves the
first part.

To prove the converse, let $(n - 1)! \equiv -1 \pmod{n}$ be given, and suppose
$n$ is not prime. Then there exists a prime $p$ which divides $n$, and $p < n$.
Since $p \mid n$, we have

$$(n - 1)! \equiv -1 \pmod{p}$$

and because $p < n$, $p$ must be one of the terms in $(n - 1)!$. Thus,
$p \mid (n - 1)!$ and $(n - 1)! \equiv 0 \pmod{p}$. The conclusion is

$$0 \equiv -1 \pmod{p},$$

a contradiction. Hence, $n$ must be prime. ∎
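
Wilson's theorem gives a primality criterion that is simple to state in code, although it is far too slow to be practical. The sketch below (the function name `wilson_is_prime` is an assumption of this illustration) computes $(n-1)!$ modulo $n$ and compares it with $-1$.

```python
def wilson_is_prime(n):
    """Criterion of 3-5-13: n > 1 is prime iff (n-1)! is -1 mod n."""
    if n < 2:
        return False
    fact = 1
    for k in range(2, n):
        fact = (fact * k) % n
    return fact == n - 1  # n - 1 represents -1 modulo n


print([n for n in range(2, 40) if wilson_is_prime(n)])
# -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
```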


3-5-14. Proposition. If a polynomial $f(x) \in \mathbf{Z}_p[x]$ is given, there
exists a unique polynomial $r(x)$ of degree less than $p$ in $\mathbf{Z}_p[x]$
which determines the same polynomial function as $f(x)$; that is,
$\bar r = \bar f \in \operatorname{Map}(\mathbf{Z}_p, \mathbf{Z}_p)$. In
particular, $r(x)$ has the same roots as $f(x)$.


Proof: existence. If $\deg f(x) < p$, which includes the case $f(x) = 0$, then
we may take $r(x) = f(x)$; so suppose $\deg f(x) \ge p$. Applying the division
algorithm to $x^p - x$ and $f(x)$, we obtain $q(x), r(x) \in \mathbf{Z}_p[x]$
for which

$$f(x) = q(x)(x^p - x) + r(x), \qquad \deg r(x) < \deg(x^p - x) = p.$$

For each $c \in \mathbf{Z}_p$, we have $c^p = c$, so that

$$\begin{aligned}
\bar f(c) &= f(c) \\
          &= q(c)(c^p - c) + r(c) \\
          &= r(c) \\
          &= \bar r(c).
\end{aligned}$$

Thus, the polynomial functions $\bar f$ and $\bar r$ are equal; or, to put it
another way, the polynomials $f(x)$ and $r(x)$ take the same value at each
$c \in \mathbf{Z}_p$; that is, $f(c) = r(c)$ for each $c \in \mathbf{Z}_p$. In
particular, $f(x)$ and $r(x)$ surely have the same roots. This shows that
$r(x)$ is a polynomial which has the required properties.

uniqueness. Suppose $s(x) \in \mathbf{Z}_p[x]$ is also a polynomial of degree
less than $p$ with $\bar s = \bar f$. Then $r(x)$ and $s(x)$ are both of degree
less than $p$, and since $r(c) = s(c)$ for every $c \in \mathbf{Z}_p$ their
values agree at the $p$ distinct elements of $\mathbf{Z}_p$. According to 3-5-7,
$r(x) = s(x)$. ∎

To illustrate this result, let us work in $\mathbf{Z}_7[x]$. If
$f(x) = x^5 + 3x^2 - x + 4$, then, because its degree is less than 7,
$r(x) = x^5 + 3x^2 - x + 4$. If $f(x) = x^7 - x$, then $r(x) = 0$, since
$f(x) = 1 \cdot (x^7 - x) + 0$. If $f(x) = x^8$, then $r(x) = x^2$, because
$x^8 = x(x^7 - x) + x^2$. If $f(x) = x^7$, then $r(x) = x$, because
$x^7 = 1(x^7 - x) + x$. If $f(x) = 3x^9 + x^8 - x^7 - 3x^3 - x^2 + x + 5$, then
$r(x) = 5$, since by long division

$$3x^9 + x^8 - x^7 - 3x^3 - x^2 + x + 5 = (3x^2 + x - 1)(x^7 - x) + 5.$$

In particular, the polynomial function determined by
$f(x) = 3x^9 + x^8 - x^7 - 3x^3 - x^2 + x + 5$ takes the value 5 for every
$c \in \mathbf{Z}_7$. The reader may check (in case there is any doubt) that
$\bar f = \bar r$ in each of the foregoing examples, by comparing $f(c)$ and
$r(c)$ for each $c \in \mathbf{Z}_7$.

In the preceding examples, we found $r(x)$ by taking the remainder when $f(x)$
is divided by $x^p - x$ (here $p = 7$). Another way to obtain $r(x)$ is to
start with $f(x)$, and in any power of $x$ greater than or equal to 7 replace
$x^7$ by $x$; and then keep going until only powers of $x$ less than the 7th
remain. For example,

$$\begin{aligned}
x^{10} - 2x^9 + 3x^8 - x^7
  &= x^7 \cdot x^3 - 2x^2 \cdot x^7 + 3x \cdot x^7 - x^7 \\
  &\to x \cdot x^3 - 2x^2 \cdot x + 3x \cdot x - x \\
  &= x^4 - 2x^3 + 3x^2 - x.
\end{aligned}$$
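
The replacement rule just described is easy to mechanize: replacing $x^p$ by $x$ lowers an exponent $k \ge p$ by $p - 1$. The sketch below assumes a coefficient-list representation (`coeffs[k]` is the coefficient of $x^k$); the function name `reduce_mod_xp_minus_x` is hypothetical.

```python
def reduce_mod_xp_minus_x(coeffs, p):
    """Return r(x) of degree < p with the same function on Z_p as f(x).

    coeffs[k] is the coefficient of x^k. Replacing x^p by x sends x^k
    (k >= p) to x^(k - (p - 1)); this is repeated until the exponent
    drops below p.
    """
    r = [0] * p
    for k, a in enumerate(coeffs):
        while k >= p:
            k -= p - 1
        r[k] = (r[k] + a) % p
    # Drop trailing zero coefficients (keep [0] for the zero polynomial).
    while len(r) > 1 and r[-1] == 0:
        r.pop()
    return r


# x^10 - 2x^9 + 3x^8 - x^7 in Z_7[x]
f = [0] * 7 + [-1, 3, -2, 1]
print(reduce_mod_xp_minus_x(f, 7))  # -> [0, 6, 3, 5, 1]
```

The output `[0, 6, 3, 5, 1]` is $x^4 + 5x^3 + 3x^2 + 6x$, which equals $x^4 - 2x^3 + 3x^2 - x$ in $\mathbf{Z}_7[x]$.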


3-5-15. Exercise. We have seen in 3-5-8 that when the field $F$ is infinite,
the homomorphism

$$A \colon F[x] \to \operatorname{Map}(F, F)$$

is injective. On the other hand, when $F$ is the finite field $\mathbf{Z}_p$,
the homomorphism

$$A \colon \mathbf{Z}_p[x] \to \operatorname{Map}(\mathbf{Z}_p, \mathbf{Z}_p)$$

is not injective, for the nonzero polynomial $x^p - x$ is clearly in the kernel
of $A$.

Even more, if $f(x) \in \mathbf{Z}_p[x]$, then, with the notation of 3-5-14,

$$r(x) = 0 \iff (x^p - x) \mid f(x)$$

and it follows that

$$\bar f = 0 \in \operatorname{Map}(\mathbf{Z}_p, \mathbf{Z}_p) \iff (x^p - x) \mid f(x).$$

Consequently, $\ker A$ consists of all multiples of $x^p - x$. (Note that the
polynomial 0 is also a multiple of $x^p - x$.) In particular, for
$f(x), g(x) \in \mathbf{Z}_p[x]$, we have

$$\bar f = \bar g \iff (x^p - x) \mid (f(x) - g(x)).$$

We have seen in 3-5-5 that a polynomial of degree $n$ over a field can have
at most $n$ distinct roots in the field. When the field in question is the
finite field $\mathbf{Z}_p$ we can characterize those polynomials of degree $n$
that have exactly $n$ distinct roots.

3-5-16. Proposition. If $f(x) \in \mathbf{Z}_p[x]$ is of degree $n \ge 1$, then
$f(x)$ has exactly $n$ distinct roots in $\mathbf{Z}_p$ $\iff$
$f(x) \mid (x^p - x)$.


Proof: Let $g(x)$ be the unique monic associate of $f(x)$. Then
$\deg g(x) = n$, $g(x)$ has precisely the same roots as $f(x)$, and
$g(x) \mid (x^p - x)$ if and only if $f(x) \mid (x^p - x)$. Therefore, it
suffices to prove our assertion for $g(x)$.

If $g(x)$ has the $n$ distinct roots $c_1, c_2, \ldots, c_n$ in $\mathbf{Z}_p$,
then in virtue of 3-5-4,

$$g(x) = (x - c_1)(x - c_2) \cdots (x - c_n).$$

Now, $c_1, c_2, \ldots, c_n$ are distinct elements from the set
$\mathbf{Z}_p = \{0, 1, \ldots, p - 1\}$, so $n \le p$ and, even more, each
$(x - c_i)$ appears exactly once on the right-hand side of

$$x^p - x = x(x - 1)(x - 2) \cdots (x - (p - 1)).$$

Consequently, $g(x) \mid (x^p - x)$.

Conversely, if $g(x) \mid (x^p - x)$, then there exists
$h(x) \in \mathbf{Z}_p[x]$ for which

$$g(x)h(x) = x^p - x = x(x - 1)(x - 2) \cdots (x - (p - 1)). \qquad (*)$$

Now, $g(x)$ and $h(x)$ may be factored into primes of $\mathbf{Z}_p[x]$. But
the prime factorization of $g(x)h(x)$ is given by the right side of $(*)$.
Since $g(x)$ is monic of degree $n$, $h(x)$ is clearly monic of degree $p - n$.
Thus, by uniqueness of the prime factorization, $g(x)$ is the product of $n$ of
the linear factors on the right side of $(*)$ [and $h(x)$ is the product of the
remaining $p - n$ terms on the right side of $(*)$]; hence, it follows that
$g(x)$ has $n$ distinct roots. ∎

3-5-17. Examples. Let us illustrate the preceding result by examining
several polynomials in $\mathbf{Z}_{11}[x]$ and deciding if they have the
maximum possible number of distinct roots in $\mathbf{Z}_{11}$ (namely, $n$,
when the polynomial has degree $n$).

(1) The polynomial

$$f(x) = x^{12} - 3x^7 + 4x^2 + 2 \in \mathbf{Z}_{11}[x]$$

cannot have 12 distinct roots in $\mathbf{Z}_{11}$, since $\mathbf{Z}_{11}$ has
only 11 distinct elements. Of course, $f(x)$ does not divide $x^{11} - x$ (its
degree is too high), so again $f(x)$ does not have 12 distinct roots. How many
distinct roots does $f(x)$ have? At this stage, we can do little more than test
all the elements $0, 1, 2, \ldots, 10$ of $\mathbf{Z}_{11}$ to see if they are
roots. Note that since

$$f(x) = x(x^{11} - x) + (-3x^7 + 5x^2 + 2)$$

the roots of $f(x)$ are the same as those of $r(x) = -3x^7 + 5x^2 + 2$.

(2) The polynomial

$$f(x) = x^9 + x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 \in \mathbf{Z}_{11}[x],$$

being of degree 9, can have at most nine distinct roots in $\mathbf{Z}_{11}$.
But $f(x)$ divides $x^{11} - x$; in fact,

$$x^{11} - x = (x^2 - x)f(x) = x(x - 1)f(x),$$

so $f(x)$ has nine distinct roots; and obviously the roots are 2, 3, 4, 5, 6,
7, 8, 9, 10, since from the factorization of $x^{11} - x$ into 11 linear terms
it follows that

$$f(x) = (x - 2)(x - 3)(x - 4)(x - 5)(x - 6)(x - 7)(x - 8)(x - 9)(x - 10).$$

The reader may multiply this out to check it, and also for practice computing
in $\mathbf{Z}_{11}[x]$.

(3) The polynomial

$$f(x) = x^2 + x + 3 \in \mathbf{Z}_{11}[x]$$

has at most two distinct roots in $\mathbf{Z}_{11}$. By long division, it is not
hard to see that $f(x)$ does not divide $x^{11} - x$; so $f(x)$ does not have
two distinct roots. If we test all 11 elements of $\mathbf{Z}_{11}$, it turns
out that 5 is a root. [Once 5 has been found to be a root, there is no need to
test for other roots because we have already shown that $f(x)$ does not have
two distinct roots.] Then $f(x)$ factors in $\mathbf{Z}_{11}[x]$ as

$$f(x) = (x - 5)(x - 5) = (x - 5)^2.$$

Thus, 5 should be considered as a “double” root of $f(x)$; there are really two
roots, but they are equal.

(4) Consider the polynomial

$$f(x) = 3x^2 - x + 3 \in \mathbf{Z}_{11}[x].$$

Its unique monic associate is (since 4 is the multiplicative inverse of 3 in
$\mathbf{Z}_{11}$)

$$g(x) = 4f(x) = x^2 - 4x + 1 \in \mathbf{Z}_{11}[x].$$

Of course, the roots of $g(x)$ are the same as those of $f(x)$. By long
division, one sees that $g(x) \mid (x^{11} - x)$, so $g(x)$ has two distinct
roots in $\mathbf{Z}_{11}$. In fact, by trial and error, $g(x)$ has the two
roots 7, 8 and

$$g(x) = (x - 7)(x - 8).$$

Naturally, $f(x)$ also has the two roots 7, 8 and

$$f(x) = 3g(x) = 3(x - 7)(x - 8).$$

(5) The polynomial

$$f(x) = x^2 - 4x - 4 \in \mathbf{Z}_{11}[x]$$

does not divide $x^{11} - x$, so it does not have two distinct roots. In fact,
by trial and error, $f(x)$ has no roots in $\mathbf{Z}_{11}$. Thus, $f(x)$ has
no linear factors; and because $f(x)$ has degree 2, we conclude that it is
irreducible.

(6) Consider the polynomial

$$f(x) = x^3 - 5x^2 + 5x - 9 \in \mathbf{Z}_{11}[x].$$

It has at most three distinct roots in $\mathbf{Z}_{11}$. Because
$f(x) \nmid (x^{11} - x)$ (which is left to the reader to check), $f(x)$ does
not have three distinct roots. How many roots does $f(x)$ have? Clearly, 0 and
1 are not roots of $f(x)$, but 2 is a root. Therefore, $(x - 2)$ divides
$f(x)$; in fact, we have

$$f(x) = (x - 2)(x^2 - 3x - 1) \in \mathbf{Z}_{11}[x].$$

Now, it is easy to see that $x^2 - 3x - 1$ has no roots in $\mathbf{Z}_{11}$
(so it is irreducible). Consequently, $f(x)$ has the single root, 2.
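
The “trial and error” in these examples amounts to evaluating each polynomial at the 11 elements of $\mathbf{Z}_{11}$. A minimal sketch follows; the helper `roots_in_Zp` and the coefficient lists (constant term first) are assumptions of this illustration.

```python
def roots_in_Zp(coeffs, p):
    """All roots in Z_p of the polynomial with coeffs[k] = coefficient of x^k."""
    def value(c):
        acc = 0
        for a in reversed(coeffs):  # Horner's rule, reduced mod p
            acc = (acc * c + a) % p
        return acc

    return [c for c in range(p) if value(c) == 0]


examples = {
    "(2) x^9 + x^8 + ... + x + 1": [1] * 10,
    "(3) x^2 + x + 3": [3, 1, 1],
    "(4) 3x^2 - x + 3": [3, -1, 3],
    "(5) x^2 - 4x - 4": [-4, -4, 1],
    "(6) x^3 - 5x^2 + 5x - 9": [-9, 5, -5, 1],
}
for name, coeffs in examples.items():
    print(name, roots_in_Zp(coeffs, 11))
```

The printed root lists are 2 through 10 for (2), the single root 5 for (3), the roots 7 and 8 for (4), no roots for (5), and the single root 2 for (6), matching the discussion above.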

According to 3-5-12 or 3-5-16, $x^{p-1} - 1$ has $p - 1$ distinct roots in
$\mathbf{Z}_p$. However, a somewhat stronger result holds:


3-5-18. Proposition. If $p$ is prime and $d \mid (p - 1)$, $d > 0$, then the
polynomial $x^d - 1$ has exactly $d$ distinct roots in $\mathbf{Z}_p$.


Proof: Let us write $p - 1 = dt$ and make use of 3-5-16. Thus, $x^d - 1$
divides $x^p - x$, since

$$\begin{aligned}
x^p - x &= x(x^{p-1} - 1) \\
        &= x[(x^d)^t - 1] \\
        &= x(x^d - 1)[(x^d)^{t-1} + (x^d)^{t-2} + \cdots + x^d + 1].
\end{aligned}$$

This completes the proof. ∎

In virtue of this result, the polynomial $x^9 - 1 \in \mathbf{Z}_{127}[x]$ has
exactly nine roots in $\mathbf{Z}_{127}$ (for $p = 127$ is prime and $d = 9$
divides $p - 1 = 126$), but we do nothing about trying to locate them. By the
same reasoning $x^2 - 1$ has two distinct roots in $\mathbf{Z}_{127}$, and
$x^7 - 1$ has seven distinct roots in $\mathbf{Z}_{127}$.
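
Although the text does not locate these roots, for a modulus as small as 127 they can be found by exhaustive search. The following sketch (with the hypothetical helper `roots_of_unity_mod_p`) simply tests every nonzero element using modular exponentiation.

```python
def roots_of_unity_mod_p(d, p):
    """All c in Z_p with c^d = 1, found by testing every nonzero element."""
    return [c for c in range(1, p) if pow(c, d, p) == 1]


for d in (2, 7, 9):
    roots = roots_of_unity_mod_p(d, 127)
    print(d, len(roots), roots)
```

In each case the number of roots found equals $d$, as 3-5-18 predicts; for instance the roots of $x^7 - 1$ are the powers of 2, because $2^7 = 128 = 1$ in $\mathbf{Z}_{127}$.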

Now that we have learned a few facts about polynomials over Z p , let us 
have a look at polynomials of the type familiar to us from high school 
algebra—namely, those with coefficients from the fields Q, R, or C. Of course, 
the same polynomial may be viewed over different fields, and its roots and 
factorization depend on the choice of the field in question. For example, 
consider the polynomial 


$$x^2 + 1.$$

Viewing it in $\mathbf{Q}[x]$: clearly, it has no roots in $\mathbf{Q}$ (since
there is no rational number whose square is $-1$) and thus $x^2 + 1$ is
irreducible over $\mathbf{Q}$. Viewing it in $\mathbf{R}[x]$: there are no roots
in $\mathbf{R}$ and $x^2 + 1$ is irreducible over $\mathbf{R}$. Viewing it in
$\mathbf{C}[x]$: clearly, $\pm i$ are roots in $\mathbf{C}$ (in fact, the reason
why one enlarges the reals, $\mathbf{R}$, to form the complexes, $\mathbf{C}$,
is precisely to have a field in which $x^2 + 1$ has a root) and we have the
factorization

$$x^2 + 1 = (x - i)(x + i)$$

over $\mathbf{C}$.

As another example, consider the polynomial

$$x^2 - 2.$$

As a polynomial in $\mathbf{Q}[x]$, it has no roots since there is no rational
number whose square is 2. (This is the familiar statement that “the $\sqrt{2}$
is irrational,” as observed in Miscellaneous Problem 20 of Chapter I.) Thus,
$x^2 - 2$ is irreducible over $\mathbf{Q}$. As a polynomial in $\mathbf{R}[x]$,
$x^2 - 2$ clearly has the roots $\pm\sqrt{2}$ (note that $\sqrt{2}$ is a
perfectly good real number; it is the number associated with the hypotenuse of
a right triangle both of whose legs have length 1), and we have the
factorization

$$x^2 - 2 = (x - \sqrt{2})(x + \sqrt{2})$$

over $\mathbf{R}$. Of course, the story over $\mathbf{C}$ now goes exactly like
the story over $\mathbf{R}$: $x^2 - 2$ has the roots $\pm\sqrt{2}$ in
$\mathbf{C}$, and its factorization in $\mathbf{C}[x]$ is
$(x - \sqrt{2})(x + \sqrt{2})$.

We begin our discussion of high school-type polynomials by considering 
polynomials over C. The key fact here is a theorem of immense importance 
whose proof is unfortunately far beyond the scope of this book—namely: 

3-5-19. Fundamental Theorem of Algebra. Any polynomial f(x) e C[x] of 
degree 1 or more has a root in C. 

It may be remarked that every known proof of the fundamental theorem
of algebra, and there are many of them, makes use of nonalgebraic properties
of $\mathbf{C}$. Note that although the theorem asserts the existence of a root
for every polynomial in $\mathbf{C}[x]$ it says nothing about how to find such
a root. Nevertheless, the mere fact of existence of a single root has important
consequences. Thus, suppose $f(x)$ has degree $n \ge 1$ and $c_1 \in \mathbf{C}$
is the root of $f(x)$ specified by the theorem. Then $(x - c_1)$ divides
$f(x)$; so there exists a polynomial $f_1(x) \in \mathbf{C}[x]$ of degree
$n - 1$ such that

$$f(x) = (x - c_1)f_1(x).$$

If $n - 1 \ge 1$, then the fundamental theorem of algebra applies to $f_1(x)$;
so there exists $c_2 \in \mathbf{C}$ which is a root of $f_1(x)$, and we can
then write $f_1(x) = (x - c_2)f_2(x)$ where $f_2(x) \in \mathbf{C}[x]$ is of
degree $n - 2$, and hence

$$f(x) = (x - c_1)(x - c_2)f_2(x).$$

If $\deg f_2(x) = n - 2$ is still greater than or equal to 1, the process may
be repeated. This procedure keeps going (the reader may set up a formal
induction if he wishes) until one arrives at:

3-5-20. Theorem. Any polynomial
$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbf{C}[x]$ of degree $n \ge 1$
factors into $n$ linear factors; more precisely, there exist
$c_1, c_2, \ldots, c_n \in \mathbf{C}$ such that

$$f(x) = a_n(x - c_1)(x - c_2) \cdots (x - c_n). \qquad (*)$$

Another way to express this result is: A polynomial $f(x) \in \mathbf{C}[x]$ of
degree $n \ge 1$ has $n$ roots in $\mathbf{C}$. But note that these $n$ roots
$c_1, c_2, \ldots, c_n$ need not be distinct; this depends on whether or not
the factors $(x - c_i)$, $i = 1, \ldots, n$, are distinct.

Now, according to 3-4-16, since $\mathbf{C}$ is a field, our
$f(x) \in \mathbf{C}[x]$ can be expressed uniquely (up to order) as a product
of a nonzero constant and a finite number of monic irreducible polynomials over
$\mathbf{C}$. But any linear term of form $x - c_i$ is monic and irreducible, so

$$f(x) = a_n(x - c_1)(x - c_2) \cdots (x - c_n)$$

is the “factorization into primes”! In particular, it follows that a polynomial
$f(x)$ of degree 2 or more cannot be irreducible; its reducibility is displayed
by its factorization into linear factors. We conclude:


3-5-21. Corollary. A polynomial in C[x] is prime (that is, irreducible) if 
and only if it is of degree 1 . 


According to this result

$$\{x - c \mid c \in \mathbf{C}\}$$

is the set of all monic irreducible polynomials over $\mathbf{C}$. Furthermore,
these primes of $\mathbf{C}[x]$ are distinct in the sense that if $c \ne c'$,
then the primes $x - c$ and $x - c'$ are not associates. (In particular, the
number of primes in $\mathbf{C}[x]$ is infinite.) On the other hand, an
arbitrary irreducible polynomial, being of form $ax + b$,
$a, b \in \mathbf{C}$, $a \ne 0$, is an associate of one of these; namely, of
$a^{-1}(ax + b) = x - (-b/a)$.


3-5-22. Remark. In virtue of the preceding discussion, we know that any
quadratic polynomial

$$f(x) = ax^2 + bx + c \in \mathbf{C}[x]$$

has two roots in $\mathbf{C}$ (that is, $f(x)$ factors into the product of $a$
and two monic polynomials of degree 1). These roots may be found explicitly by
the method which is often referred to as “completing the square”; it goes as
follows:

$$\begin{aligned}
z_0 \in \mathbf{C} \text{ is a root of } f(x)
  &\iff f(z_0) = a z_0^2 + b z_0 + c = 0 \\
  &\iff 4a^2 z_0^2 + 4ab z_0 + 4ac = 0 \qquad (\text{since } a \ne 0) \\
  &\iff 4a^2 z_0^2 + 4ab z_0 = -4ac \\
  &\iff 4a^2 z_0^2 + 4ab z_0 + b^2 = b^2 - 4ac \\
  &\iff (2a z_0 + b)^2 = b^2 - 4ac \\
  &\iff 2a z_0 + b = \pm\sqrt{b^2 - 4ac} \\
  &\iff z_0 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
\end{aligned}$$

Of course, this result is known as the “quadratic formula”; it says that the
two roots of $ax^2 + bx + c$ are

$$\frac{-b + \sqrt{b^2 - 4ac}}{2a}, \qquad \frac{-b - \sqrt{b^2 - 4ac}}{2a}$$

and its factorization is

$$ax^2 + bx + c = a\left[x - \left(\frac{-b + \sqrt{b^2 - 4ac}}{2a}\right)\right]\left[x - \left(\frac{-b - \sqrt{b^2 - 4ac}}{2a}\right)\right].$$

However, one matter of concern remains: does the square root of the
complex number $b^2 - 4ac$ really exist, and can we find it? More generally,
given any complex number $s + ti$ ($s$ and $t$ are real) do there exist any
complex numbers whose square is $s + ti$, and can we find them? This amounts to
finding the roots of the polynomial

$$x^2 - (s + ti)$$

over $\mathbf{C}$. According to 3-5-20, this polynomial has two roots in
$\mathbf{C}$. If $u + vi$ (with $u, v \in \mathbf{R}$) is a root, then

$$(u + vi)^2 = s + ti \qquad (*)$$

and clearly $-(u + vi)$ is also a root. Thus, $u + vi$ and $-(u + vi)$ are the
two roots, and the factorization is

$$x^2 - (s + ti) = [x - (u + vi)][x + (u + vi)].$$

The notation $\pm\sqrt{s + ti}$ refers to the two roots of $x^2 - (s + ti)$. It
does not matter which element of the pair $\{u + vi, -u - vi\}$ we take to be
$+\sqrt{s + ti}$ and which one to be $-\sqrt{s + ti}$. [Since $\mathbf{C}$
cannot be ordered (see 2-3-2) there is no satisfactory notion of positivity in
$\mathbf{C}$, so we are in no position to speak of the “positive square root”
of $s + ti$.] Of course, when $s + ti$ is a positive real number, that is, when
$t = 0$ and $s > 0$, we still take $\sqrt{s + ti} = \sqrt{s}$ to mean the
positive square root; that is, the positive real number whose square is $s$.

What about finding the two square roots of $s + ti$ explicitly? From $(*)$
we have the two equations

$$\begin{aligned}
u^2 - v^2 &= s, \\
2uv &= t.
\end{aligned}$$

If $t = 0$ (that is, when $s + ti$ is real), then: if $s > 0$, we take $v = 0$
and $u = \sqrt{s}$ (the usual positive square root of the positive real number
$s$), and if $s < 0$, we take $u = 0$ and $v = \sqrt{-s}$. Of course, the case
$t = 0$, $s = 0$ is trivial as $u = v = 0$ will do.

Now, if $t \ne 0$, then $u \ne 0$ and $v \ne 0$ (since $2uv = t$), so we may
substitute $v = t/2u$ in the first equation. The end result is that one root
$u + vi$ of $x^2 - (s + ti)$ is given by

$$u = \sqrt{\frac{s + \sqrt{s^2 + t^2}}{2}}, \qquad
  v = \frac{t}{\sqrt{2\left(s + \sqrt{s^2 + t^2}\right)}} \qquad (\#)$$

and the other one is then $-u - vi$. Of course, all the square roots that occur
here are positive square roots of positive real numbers (for when $t \ne 0$,
$s^2 + t^2 > 0$ and $s + \sqrt{s^2 + t^2} > 0$).

As an example of all this, consider the polynomial

$$f(x) = x^2 + x + (1 + i) \in \mathbf{C}[x].$$

By the quadratic formula, the two roots are

$$\frac{-1 \pm \sqrt{-3 - 4i}}{2}.$$

As for $\pm\sqrt{-3 - 4i}$, these are the two roots $u + vi$, $-(u + vi)$ of
$x^2 + 3 + 4i$. The reader may compute $u$ and $v$ from scratch, or simply
substitute in $(\#)$, thus obtaining

$$u = 1, \qquad v = -2.$$

Hence, the roots of $f(x)$ are

$$\frac{-1 \pm (1 - 2i)}{2},$$

that is, $-i$ and $-1 + i$.
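
The recipe for a complex square root and the quadratic formula can be combined in a few lines of floating-point code. This is a sketch, not a robust numerical routine: the function names `complex_sqrt` and `quadratic_roots` are hypothetical, and the formula for $u$ and $v$ is the one obtained by substituting $v = t/2u$ into $u^2 - v^2 = s$.

```python
import math


def complex_sqrt(s, t):
    """One square root u + vi of s + ti, following the case analysis above."""
    if t == 0:
        if s >= 0:
            return complex(math.sqrt(s), 0.0)
        return complex(0.0, math.sqrt(-s))
    r = math.sqrt(s * s + t * t)
    u = math.sqrt((s + r) / 2)
    v = t / math.sqrt(2 * (s + r))
    return complex(u, v)


def quadratic_roots(a, b, c):
    """Both roots of a x^2 + b x + c for complex a, b, c with a != 0."""
    disc = b * b - 4 * a * c
    w = complex_sqrt(disc.real, disc.imag)
    return (-b + w) / (2 * a), (-b - w) / (2 * a)


print(complex_sqrt(-3, -4))           # one square root of -3 - 4i
print(quadratic_roots(1, 1, 1 + 1j))  # roots of x^2 + x + (1 + i)
```

For the example in the text this yields the square root $1 - 2i$ and the two roots $-i$ and $-1 + i$.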

Next, let us consider any polynomial $f(x) \in \mathbf{R}[x]$ of degree 1 or
more and with leading coefficient $a_n \ne 0$. From the foregoing, we know that
when $f(x)$ is viewed in $\mathbf{C}[x]$ it factors in the form

$$f(x) = a_n(x - c_1)(x - c_2) \cdots (x - c_n), \qquad c_1, \ldots, c_n \in \mathbf{C}.$$

Some of the roots $c_1, \ldots, c_n$ may be nonreal complex numbers and others
may be real. Concerning the nonreal roots, we have:

3-5-23. Proposition. If
$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbf{R}[x]$ is a polynomial of
degree $n \ge 1$ and $c \in \mathbf{C}$ is a nonreal root of $f(x)$, then its
complex conjugate $\bar c$ is also a root of $f(x)$.

Proof: This is the familiar statement that the complex (nonreal) roots of
a real polynomial occur in conjugate pairs. The proof is easy; it makes use of
such basic properties of conjugation in $\mathbf{C}$ as: the conjugate of a sum
is the sum of conjugates, the conjugate of a product is the product of the
conjugates, a real number is its own conjugate. Namely,

$$\begin{aligned}
c \text{ is a root of } f(x)
  &\Rightarrow a_0 + a_1 c + a_2 c^2 + \cdots + a_n c^n = 0 \\
  &\Rightarrow \overline{a_0 + a_1 c + a_2 c^2 + \cdots + a_n c^n} = \bar 0 = 0 \\
  &\Rightarrow \overline{a_0} + \overline{a_1 c} + \overline{a_2 c^2} + \cdots + \overline{a_n c^n} = 0 \\
  &\Rightarrow \overline{a_0} + \overline{a_1}\,\bar c + \overline{a_2}\,\bar c^{\,2} + \cdots + \overline{a_n}\,\bar c^{\,n} = 0 \\
  &\Rightarrow a_0 + a_1 \bar c + a_2 \bar c^{\,2} + \cdots + a_n \bar c^{\,n} = 0 \\
  &\Rightarrow f(\bar c) = 0 \\
  &\Rightarrow \bar c \text{ is a root of } f(x). \; ∎
\end{aligned}$$

Consequently, if $c \in \mathbf{C}$ is a nonreal complex root of

$$f(x) = a_n(x - c_1)(x - c_2) \cdots (x - c_n),$$

say $c = s + ti$, $s, t \in \mathbf{R}$, $t \ne 0$, then so is
$\bar c = s - ti$, and both $(x - c)$ and $(x - \bar c)$ appear among the
factors of $f(x)$. Taking both of these, we have

$$(x - c)(x - \bar c) = x^2 - (2s)x + (s^2 + t^2).$$

Since $2s$ and $s^2 + t^2$ are real numbers, the right side is a quadratic
polynomial in $\mathbf{R}[x]$. Furthermore, it is irreducible over
$\mathbf{R}$, for it has no roots in $\mathbf{R}$, the two roots $c$ and
$\bar c$ being nonreal complex numbers.

In this way, each real root of $f(x)$ determines a linear factor (in
$\mathbf{R}[x]$) of $f(x)$, and each pair of conjugate complex roots determines
an irreducible quadratic factor (in $\mathbf{R}[x]$) of $f(x)$. We have proved:


3-5-24. Theorem. Any polynomial
$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbf{R}[x]$ of degree $n \ge 1$
factors into a product of $a_n$, monic linear polynomials in $\mathbf{R}[x]$,
and monic irreducible quadratic polynomials in $\mathbf{R}[x]$.


Obviously, this factorization of f(x) is its “ unique factorization into 
primes” in R[x], whose existence and uniqueness were guaranteed in 3-4-16. 
We also note that according to this result a polynomial of degree 3 or more in 
R[x] factors into linear and/or quadratic terms, so it cannot be irreducible. 
Even more, we have: 


3-5-25. Proposition. A polynomial $f(x) \in \mathbb{R}[x]$ is irreducible $\iff$ $f(x)$ is of
one of two types:

(i) $f(x)$ is linear [that is, $\deg f(x) = 1$],

(ii) $f(x)$ is quadratic [that is, $\deg f(x) = 2$] with negative discriminant.


Proof : We have already observed that a polynomial of degree 3 or more 
cannot be irreducible in R[x] and, of course, any polynomial of degree 1 is 
irreducible. It remains, therefore, to decide under what circumstances a 
polynomial of degree 2, 

$$f(x) = ax^2 + bx + c \in \mathbb{R}[x], \quad a \ne 0,$$

is irreducible. The criterion will be based on the discriminant of $f(x)$, which is
defined to be $b^2 - 4ac$ (which is an element of $\mathbb{R}$).

The only way $f(x)$ can factor in $\mathbb{R}[x]$ is as the product of two linear
polynomials. Therefore, $f(x)$ is irreducible in $\mathbb{R}[x]$ $\iff$ it has no real root. Applying
the method of completing the square (as in 3-5-22) to $f(x) = ax^2 + bx + c$ we
see that its roots (about which we can only be certain, in general, that they are
in $\mathbb{C}$) are

$$\frac{-b \pm \sqrt{b^2 - 4ac}}{2a}, \qquad a, b, c \in \mathbb{R}.$$

Thus, there are no real roots of $f(x)$ $\iff$ the discriminant $b^2 - 4ac$ is less than
0. This completes the proof. ∎

In virtue of this result, the polynomials $x - \sqrt{2}$, $\sqrt{3}\,x - \pi$, $7x + 9$ in
$\mathbb{R}[x]$ are irreducible because they are linear. The quadratic polynomials
$x^2 + 7x - 9$, $x^2 - \pi x - \sqrt{2}$, $\sqrt{3}\,x^2 + (2 + \sqrt{3})x + \sqrt{2}$ in $\mathbb{R}[x]$ are reducible
over $\mathbb{R}$ because in each case the discriminant $b^2 - 4ac$ is greater than 0. On the
other hand, each of the quadratic polynomials $x^2 - 5x + 7$, $x^2 + \pi x + 2\sqrt{7}$,
$\sqrt{3}\,x^2 - (1 + \sqrt{3})x + \sqrt{2}$ in $\mathbb{R}[x]$ is irreducible over $\mathbb{R}$ because the discriminant
$b^2 - 4ac$ is less than 0.
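
The discriminant test of 3-5-25 is easy to mechanize. The sketch below is an illustration only: the function name `is_irreducible_over_reals` is ours, and the computation uses floating-point arithmetic, so it is reliable only when the discriminant is not extremely close to 0.

```python
import math


def is_irreducible_over_reals(a: float, b: float, c: float) -> bool:
    """A quadratic ax^2 + bx + c (a != 0) is irreducible over R
    exactly when its discriminant b^2 - 4ac is negative."""
    if a == 0:
        raise ValueError("not a quadratic: leading coefficient is 0")
    return b * b - 4 * a * c < 0


# x^2 - 5x + 7 has discriminant 25 - 28 < 0.
print(is_irreducible_over_reals(1, -5, 7))
# x^2 + 7x - 9 has discriminant 49 + 36 > 0.
print(is_irreducible_over_reals(1, 7, -9))
# x^2 + pi*x + 2*sqrt(7) has discriminant pi^2 - 8*sqrt(7) < 0.
print(is_irreducible_over_reals(1, math.pi, 2 * math.sqrt(7)))
```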

We conclude our remarks about polynomials with real coefficients by 
noting a simple consequence of 3-5-24. 


3-5-26. Corollary. A polynomial $f(x) \in \mathbb{R}[x]$ of odd degree $n$ has at least
one real root.


Proof : Because $n$ is odd, the factorization of $f(x)$ into primes cannot consist
solely of quadratic terms. There must be at least one linear factor; hence, there
is at least one real root. ∎

When we turn to polynomials in Q[x], the results are of a different type. 
For polynomials in C[x], the form of their factorization into primes has been 
determined completely, but we have little information about finding the roots 
explicitly. [Actually, we have seen how to find the roots of any f(x) e C[x] of 
degree 2; in addition it is well known that if f(x) e C[x] is of degree 3 or 4, 
then its roots can be found explicitly—in fact, the roots can be expressed in 
terms of “radicals.” It is only when $\deg f(x) \ge 5$ that there is no general
method which always locates the roots.] Furthermore, these same statements
apply for polynomials in $\mathbb{R}[x]$. On the other hand, for arbitrary $f(x) \in \mathbb{Q}[x]$
there is not much we can say about its prime factorization, but it turns out to 
be relatively easy to find all its roots in Q. 

To see this, consider $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbb{Q}[x]$. Since each $a_i$
is in $\mathbb{Q}$ it can be expressed as the ratio of two integers. If we multiply $f(x)$ by
the least common multiple of the denominators of all the $a_i$ (call it $m$), the
result $g(x) = m f(x)$ is a polynomial whose coefficients are integers and which
has the same roots as $f(x)$. Thus, when we want to discuss the roots of $f(x) \in
\mathbb{Q}[x]$ there is no harm in assuming that $f(x)$ has coefficients in $\mathbb{Z}$:

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbb{Z}[x].$$

In addition, if the gcd of $a_0, a_1, \ldots, a_n$ is greater than 1, then it may be
factored out, leaving a polynomial in $\mathbb{Z}[x]$ for which the gcd of all its coefficients
is 1, and whose roots are the same as those of $f(x)$. Thus, we may assume, if
necessary, that $f(x) \in \mathbb{Z}[x]$ is primitive, by which is meant that $(a_0, a_1, \ldots, a_n)
= 1$, or what is the same, the coefficients of $f(x)$ are relatively prime.

For example, to find the rational roots of

$$\tfrac{1}{3}x^5 - \tfrac{3}{2}x^4 + \tfrac{5}{8}x^3 + \tfrac{7}{12}x^2 - 2x + \tfrac{5}{6},$$

we may work with the polynomial

$$8x^5 - 36x^4 + 15x^3 + 14x^2 - 48x + 20,$$

and to find the rational roots of

$$54x^5 - 21x^3 + 102x^2 - 9x + 24,$$

we may work with the primitive polynomial

$$18x^5 - 7x^3 + 34x^2 - 3x + 8.$$
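
The reduction just described (clear denominators, then divide out the gcd of the coefficients) can be sketched in a few lines of Python. This is a minimal illustration under our own conventions: the name `to_primitive` is ours, and coefficients are listed with the constant term first.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def to_primitive(coeffs):
    """Given rational coefficients (constant term first), return integer
    coefficients of a primitive polynomial with the same roots."""
    fracs = [Fraction(c) for c in coeffs]
    if all(f == 0 for f in fracs):
        raise ValueError("the zero polynomial has no primitive form")
    # m = least common multiple of the denominators.
    m = reduce(lambda a, b: a * b // gcd(a, b), (f.denominator for f in fracs), 1)
    ints = [int(f * m) for f in fracs]
    # g = gcd of the (absolute values of the) integer coefficients.
    g = reduce(gcd, (abs(c) for c in ints), 0)
    return [c // g for c in ints]


# 54x^5 - 21x^3 + 102x^2 - 9x + 24, constant term first.
print(to_primitive([24, -9, 102, -21, 0, 54]))
# An input with fractional coefficients; the multiplier here is m = 24.
print(to_primitive([Fraction(5, 6), -2, Fraction(7, 12), Fraction(5, 8),
                    Fraction(-3, 2), Fraction(1, 3)]))
```

Multiplying by a nonzero integer or dividing by a common factor never changes the set of roots, which is why the output may be used in place of the input when hunting for rational roots.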


3-5-27. Proposition. Consider a polynomial

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbb{Z}[x]$$

of degree $n \ge 1$. If $r/s$ is a rational root in lowest terms [meaning that $r$
and $s$ are integers with $(r, s) = 1$], then $r \mid a_0$ and $s \mid a_n$.


Proof : Since $r/s$ is a root, we have

$$a_0 + a_1\left(\frac{r}{s}\right) + \cdots + a_n\left(\frac{r}{s}\right)^n = 0$$

and multiplying by $s^n$ gives

$$a_0 s^n + a_1 r s^{n-1} + \cdots + a_{n-1} r^{n-1} s + a_n r^n = 0. \tag{*}$$

Except for the first term, all the terms on the left side are obviously divisible
by $r$. Since 0 is divisible by $r$, it follows that the first term $a_0 s^n$ is divisible by
$r$. Because $(r, s) = 1$ we know that $(r, s^n) = 1$, and therefore $r \mid a_0$.

By applying the same technique to (*), we also obtain $s \mid a_n$. ∎

According to this result, when a polynomial with integer coefficients is
given then there are only a finite number of rationals that can possibly be roots,
namely, those of form $r/s$ where $(r, s) = 1$, $r$ divides the constant term of
$f(x)$, and $s$ divides the leading coefficient of $f(x)$ (and, of course, we can insist
that $s$ be positive, but $r$ may be negative). For example, for the polynomial

$$f(x) = 6x^5 + 19x^4 + 31x^3 + 17x^2 - 7x - 6 \in \mathbb{Z}[x]$$

the only possible rational roots are: $\pm 1, \pm 2, \pm 3, \pm 6, \pm\tfrac{1}{2}, \pm\tfrac{3}{2}, \pm\tfrac{1}{3}, \pm\tfrac{2}{3}, \pm\tfrac{1}{6}$
(since $a_0 = -6$, $a_5 = 6$). By testing these, we find all the rational roots of $f(x)$;
they are: $\tfrac{1}{2}$, $-\tfrac{2}{3}$, $-1$, and the factorization of $f(x)$ is

$$(2x - 1)(3x + 2)(x + 1)(x^2 + 2x + 3).$$

Note that this is the prime factorization of $f(x) \in \mathbb{Q}[x]$ (after all, $x^2 + 2x + 3$
is irreducible in $\mathbb{R}[x]$, so it is surely irreducible in $\mathbb{Q}[x]$).
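
The search described above is finite and mechanical, so it can be sketched directly in Python. The helper names `divisors` and `rational_roots` are ours, coefficients are listed with the constant term first, and exact rational arithmetic is used so that the test "is it a root?" involves no rounding.

```python
from fractions import Fraction
from math import gcd


def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def rational_roots(coeffs):
    """All rational roots of an integer polynomial (constant term first),
    found by testing the candidates r/s with r | a_0, s | a_n, (r, s) = 1."""
    a0, an = coeffs[0], coeffs[-1]
    if a0 == 0 or an == 0:
        raise ValueError("this sketch assumes a_0 != 0 and a_n != 0")
    candidates = {
        Fraction(sign * r, s)
        for r in divisors(a0)
        for s in divisors(an)
        if gcd(r, s) == 1
        for sign in (1, -1)
    }

    def f(x):
        return sum(a * x**i for i, a in enumerate(coeffs))

    return sorted(x for x in candidates if f(x) == 0)


# f(x) = 6x^5 + 19x^4 + 31x^3 + 17x^2 - 7x - 6
print(rational_roots([-6, -7, 17, 31, 19, 6]))
```

If the constant term is 0, one may first factor out the appropriate power of $x$ (which accounts for the root 0) and apply the sketch to what remains.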

The problem of deciding whether an arbitrary polynomial in $\mathbb{Q}[x]$ is
irreducible is essentially unsettled. However, there is one result, due to
Eisenstein (1823-1852), that enables us to recognize, or produce, many irreducible
polynomials over $\mathbb{Q}$.

3-5-28. Eisenstein’s Criterion. Suppose

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in \mathbb{Z}[x]$$

is a polynomial of degree $n \ge 1$, and suppose there exists a prime $p$ for
which

(i) $p \mid a_i$ for $i = 0, 1, \ldots, n - 1$,

(ii) $p \nmid a_n$,

(iii) $p^2 \nmid a_0$.

Then, $f(x)$ is irreducible over $\mathbb{Q}$.


Proof : We merely sketch the proof (which goes in several steps) and leave 
the details as an exercise for the reader. 

(1) Show that $f(x)$ is irreducible over $\mathbb{Z}$. In fact, suppose

$$f(x) = (b_0 + b_1 x + \cdots + b_r x^r)(c_0 + c_1 x + \cdots + c_s x^s)$$

where all the coefficients are integers, $r \ge 1$, $s \ge 1$, $r + s = n$. Since $p \mid a_0$,
$p^2 \nmid a_0$, and $a_0 = b_0 c_0$, exactly one of $b_0$, $c_0$ is divisible by $p$. Say, $p \mid b_0$ and
$p \nmid c_0$. Now, $p \nmid b_r$, since $p$ does not divide $a_n = b_r c_s$. Let $t \le r < n$ be the
smallest positive integer for which $p \nmid b_t$. It follows that $p \nmid a_t$, a contradiction.
It remains to show, therefore, that if a polynomial is irreducible in $\mathbb{Z}[x]$, then
it is irreducible in $\mathbb{Q}[x]$.

(2) The product of two primitive polynomials (in $\mathbb{Z}[x]$) is primitive. For this
suppose the product is not primitive, so there exists a prime $p$ which divides
all its coefficients. Now apply 3-4-17, Problem 30 to obtain a contradiction.

(3) Prove Gauss’ lemma which says: If $f(x) \in \mathbb{Z}[x]$ of degree 1 or more
factors over $\mathbb{Q}$, then it factors over $\mathbb{Z}$; in other words, if there exist $g(x)$,
$h(x) \in \mathbb{Q}[x]$ for which $f(x) = g(x)h(x)$, then there exist $g'(x)$, $h'(x) \in \mathbb{Z}[x]$ for
which $f(x) = g'(x)h'(x)$.

To do this, there is no loss of generality in assuming, at the start, that $f(x)$
is primitive. Beginning from $f(x) = g(x)h(x)$ one may clear of fractions and
then factor out the gcd of the coefficients for each of the polynomials on the
right side. We finally have an equation of form

$$s f(x) = r g'(x) h'(x)$$

where $r, s \in \mathbb{Z}$, $g'(x)$, $h'(x) \in \mathbb{Z}[x]$, and $g'(x)$, $h'(x)$ are both primitive. Then
$r = s$, which completes the proof. ∎

According to this result, the polynomials 

8x 5 + 36x 4 + 15x 3 - 48* + 21 and 3x 5 - 18* + 34x 2 - Sx + 42 

are irreducible over $\mathbb{Q}$: the first one satisfies the Eisenstein criterion for $p = 3$
and the second one for $p = 2$. The reader can now surely construct irreducible
polynomials over $\mathbb{Q}$ at will.

Incidentally, x 2 — 2 satisfies the Eisenstein criterion for p = 2, so it is 
irreducible over Q. Hence x 2 — 2 has no root in Q; or, to put it another way, 
there is no rational number (that is, element of Q) whose square is 2. We have 
proved that $\sqrt{2}$ is irrational.

Furthermore, for any prime $p$ and any $n \ge 2$ the polynomial

$$x^n - p$$

is irreducible over $\mathbb{Q}$ (since it is Eisenstein). Therefore, for any integer $n \ge 2$
there exists an irreducible polynomial over $\mathbb{Q}$ of degree $n$; in fact, there
exist an infinite number of irreducible ones of degree $n$, because there are an
infinite number of choices for the prime $p$.
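
Checking the three hypotheses of Eisenstein's criterion for a given prime is a purely arithmetic matter. The following Python sketch does just that; the name `satisfies_eisenstein` is ours, coefficients are listed with the constant term first, and the primality of `p` is assumed rather than verified.

```python
def satisfies_eisenstein(coeffs, p):
    """Check conditions (i)-(iii) of Eisenstein's criterion for the
    integer polynomial coeffs (constant term first) and the prime p.
    The caller is assumed to pass a genuine prime p."""
    *lower, leading = coeffs
    if not lower:
        return False  # the criterion is stated for degree >= 1
    return (
        all(a % p == 0 for a in lower)      # (i)   p | a_i for i < n
        and leading % p != 0                # (ii)  p does not divide a_n
        and coeffs[0] % (p * p) != 0        # (iii) p^2 does not divide a_0
    )


print(satisfies_eisenstein([-2, 0, 1], 2))             # x^2 - 2 with p = 2
print(satisfies_eisenstein([-7, 0, 0, 0, 0, 1], 7))    # x^5 - 7 with p = 7
print(satisfies_eisenstein([1, 0, 1], 2))              # x^2 + 1 with p = 2
```

A result of `False` says only that the criterion does not apply for that prime; it does not show that the polynomial is reducible. For instance, $x^2 + 1$ fails the test for every prime yet is irreducible over $\mathbb{Q}$.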

3-5-29 / PROBLEMS 

1. Use Lagrange interpolation to find a polynomial $f(x) \in \mathbb{R}[x]$ for which

(i) $f(2) = 5$, $f(3) = 8$, (ii) $f(-1) = 7$, $f(1) = 7$,

(iii) $f(\sqrt{2}) = \pi$, $f(\sqrt{3}) = \pi^2$.

What is really going on here? 

2. Use Lagrange interpolation to find $f(x) \in \mathbb{Q}[x]$ of degree less than or equal
to 2 for which

(i) $f(0) = 1$, $f(1) = 2$, $f(2) = 4$,

(ii) $f(-2) = 0$, $f(2) = 3$, $f(4) = 6$,

(iii) $f(-1) = 2$, $f(1) = 1$, $f(2) = 3$.

3. Do each part of Problem 2 when the field $\mathbb{Q}$ is replaced by the field
(i) $\mathbb{R}$, (ii) $\mathbb{Z}_5$, (iii) $\mathbb{Z}_7$, (iv) $\mathbb{Z}_{11}$.

4. We know that there exists a unique polynomial 

$$f(x) = a + bx + cx^2 + dx^3 \in \mathbb{R}[x]$$

of degree less than or equal to 3 whose graph passes through the four points
$(-1, 5)$, $(0, 1)$, $(1, 5)$, $(2, 19)$.

Find $a$, $b$, $c$, $d$ by the method of undetermined coefficients. Then find $f(x)$
in another way, namely, by use of Lagrange interpolation.

5. Why does Lagrange interpolation not work when the field F is replaced 
by an integral domain D ? 

6. Find the unique polynomial $f(x)$ of degree less than or equal to 3 in
$\mathbb{Z}_5[x]$ for which

$f(1) = 4$, $f(2) = 2$, $f(3) = 1$, $f(4) = 1$.

7. Do Problem 6 when $\mathbb{Z}_5$ is replaced by

(i) $\mathbb{Z}_7$, (ii) $\mathbb{Z}_{11}$.

8. What is the remainder upon division of $3x^5 + x^4 + 2x^3 + 4x^2 + 5x + 6$
by $x - 3$ when these polynomials are viewed in

(i) $\mathbb{Z}[x]$, (ii) $\mathbb{Q}[x]$, (iii) $\mathbb{Z}_2[x]$, (iv) $\mathbb{Z}_3[x]$,

(v) $\mathbb{Z}_5[x]$, (vi) $\mathbb{Z}_6[x]$, (vii) $\mathbb{Z}_7[x]$, (viii) $\mathbb{Z}_{11}[x]$?

9. Give a proof of 3-5-4 based on the least common multiple of $x - c_1$,
$x - c_2, \ldots, x - c_n$ in $F[x]$.

10. (i) Suppose $f(x), g(x) \in F[x]$ are of degrees $m$ and $n$, respectively, and
the number of elements in the field $F$ is greater than $\max\{m, n\}$. If
$f(c) = g(c)$ for all $c \in F$, then $f(x) = g(x)$.

(ii) Is this result valid if $F$ is replaced by an integral domain $D$?

11. In the text, we proved 3-5-4, 3-5-6, 3-5-7 over a field F. When F is replaced 
by R , an arbitrary commutative ring with unity, each of these results fails. 
For each result, give an example of a ring R for which it is false. 

12. Consider the polynomials 

(a) $x^3 + x + 3$, (b) $x^4 - 5x^3 + x^2 - 3x + 2$, (c) $x^7 - x$,

(d) $x^4 + 3x^2 + 2$, (e) $x^4 - 1$, (f) $x^4 + 1$.

For each one find all its roots when it is viewed as a polynomial over the field:
(i) $\mathbb{Z}_2$, (ii) $\mathbb{Z}_3$, (iii) $\mathbb{Z}_5$,

(iv) $\mathbb{Z}_7$, (v) $\mathbb{Z}_{11}$, (vi) $\mathbb{Z}_{13}$.

In each case (there are 36 cases: six polynomials and six fields) give the
prime factorization.

13. Decide if each of the following polynomials is irreducible over
(a) $\mathbb{Q}$, (b) $\mathbb{R}$, (c) $\mathbb{C}$
and, if not, factor it when possible.

(i) $x^2 - 13$, (ii) $x^2 + 13$,

(iii) $x^2 - 5x + 6$, (iv) $x^2 + x + 1$,

(v) $x^2 + 2x - 2$, (vi) $x^3 - 2$,

(vii) $x^3 + 2$, (viii) $x^3 + x + 1$,

(ix) $x^3 + x^2 + 1$, (x) $x^3 + 2x^2 - 3x - 1$,

(xi) $x^3 + x + 2$, (xii) $x^3 + 2x^2 - 5x - 6$.

14. Treat the polynomials of Problem 13 in the same way over the fields: 

(i) $\mathbb{Z}_2$, (ii) $\mathbb{Z}_3$, (iii) $\mathbb{Z}_5$,

(iv) $\mathbb{Z}_7$, (v) $\mathbb{Z}_{11}$, (vi) $\mathbb{Z}_{17}$.

15. (i) Find all irreducible polynomials of degree less than or equal to 3 in
$\mathbb{Z}_3[x]$.

(ii) Find all irreducible polynomials of degree less than or equal to 5 over
the field $\mathbb{Z}_2$. There are 14 of them.

16. The number of monic polynomials of degree 2 over the field $\mathbb{Z}_p$ is $p^2$;
show that $(p^2 - p)/2$ of them are irreducible.

17. Find the square roots of the following complex numbers: 

(i) $i$, (ii) $-i$, (iii) $3 - 4i$, (iv) $5 + 12i$.

18. Solve the following quadratic polynomials over $\mathbb{C}$:

(i) $x^2 + ix + (1 - i)$, (ii) $2x^2 - \sqrt{3}\,x + (2 + i)$,

(iii) $x^2 + (1 + i)x + (3 + 4i)$, (iv) $(3 - i)x^2 + (1 + 2i)x - (2 + 3i)$.

19. Find all rational roots of the following polynomials: 

(i) $x^4 + x^3 + x^2 + x + 1$,

(ii) $x^5 + x^4 + x^3 + x^2 + x + 1$,

(iii) $x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$,

(iv) $x^{50} - x^{20} + x^{10} - 1$,

(v) $x^{100} - x^{50} + 1$,

(vi) $9x^4 + 6x^3 + 19x^2 + 12x + 2$,

(vii) $x^{11} + 2x^9 - 2$,

(viii) $x^m + 3x^{m-1} - 3$, $m > 1$,

(ix) $4x^3 + 3x^2 + 6x + 12$,

(x) $5x^3 - x^2 - 20x + 4$.

20. Find the prime factorizations of the following polynomials over C and 
over R: 

(i) $x^3 + 1$, (ii) $x^4 + 1$, (iii) $x^5 + 1$.

21. (i) Find a polynomial of degree 4 in $\mathbb{Z}_{13}[x]$ which has four distinct roots.
(ii) Does x 4 — lx 3 — lx — l e Z 13 [x] have four distinct roots? 

(iii) What about $3x^4 - x^3 + 2x^2 + 5x - 3 \in \mathbb{Z}_{13}[x]$?

22. Which of the following polynomials are irreducible over Q ? 

(i) $x^5 + 3x^3 + 6x + 3$, (ii) $x^5 - 3x^3 + 3x^2 - 9$,

(iii) $x^5 + 6x^3 - 3x + 15$, (iv) $x^7 - 91$.

23. (i) For any $a \in F$, $f(x)$ is irreducible over $F$ $\iff$ $f(x + a)$ is irreducible
over $F$.

(ii) Show that x 2 + 1 is irreducible over Q by making use of the Eisenstein 
criterion. 

24. Use Wilson’s theorem to prove that if $p$ is a prime $\equiv 1 \pmod{4}$, then the
congruence

$$x^2 + 1 \equiv 0 \pmod{p}$$

has a solution; that is, there is an element of $\mathbb{Z}_p$ whose square is $-1$.

25. Suppose both of the polynomials $f(x)$, $g(x) \in \mathbb{Q}[x]$ have the real number
$a$ as a root. Show that $f(x)$ and $g(x)$ are not relatively prime.

26. Suppose $f(x) \in F[x]$ is of degree greater than or equal to 1; then, $f(x)$ is
irreducible if and only if it satisfies the following property: if $f(x) \mid g(x)h(x)$
and $f(x) \nmid g(x)$, then $f(x) \mid h(x)$. (That is, if $f(x)$ divides a product, then it
must divide at least one of the factors.)

27. The polynomial $f(x) = x^3 - x - 1$ is irreducible over $\mathbb{Z}_2$. Take a formal
symbol $\alpha$ and make believe it is a root of $f(x)$, so $\alpha^3 - \alpha - 1 = 0$ or
$\alpha^3 = \alpha + 1$. Let $\mathbb{Z}_2[\alpha]$ denote the set of all formal expressions

$$a_0 + a_1 \alpha + a_2 \alpha^2, \qquad a_0, a_1, a_2 \in \mathbb{Z}_2.$$

Define addition in $\mathbb{Z}_2[\alpha]$ by

$$(a_0 + a_1 \alpha + a_2 \alpha^2) + (b_0 + b_1 \alpha + b_2 \alpha^2) = (a_0 + b_0) + (a_1 + b_1)\alpha + (a_2 + b_2)\alpha^2.$$

Define multiplication in $\mathbb{Z}_2[\alpha]$ by multiplying out as for polynomials and
replacing higher powers of $\alpha$ than the second by combination of lower
powers; that is, $\alpha^3 = \alpha + 1$, $\alpha^4 = \alpha^2 + \alpha$.

Show that $\mathbb{Z}_2[\alpha]$ is a commutative ring with unity, having eight
elements, and containing $\mathbb{Z}_2$.

Even more, $\mathbb{Z}_2[\alpha]$ is a field. One way to prove this is to simply look at
the multiplication table for $\mathbb{Z}_2[\alpha]$. Another, more sophisticated way, is
as follows: Given $a_0 + a_1 \alpha + a_2 \alpha^2 \ne 0$ in $\mathbb{Z}_2[\alpha]$, write $g(x) = a_0 + a_1 x
+ a_2 x^2 \in \mathbb{Z}_2[x]$. Since $f(x)$ is prime, clearly $(g(x), f(x)) = 1$, so there
exist $u(x)$, $v(x) \in \mathbb{Z}_2[x] \subset (\mathbb{Z}_2[\alpha])[x]$ such that

$$g(x)u(x) + f(x)v(x) = 1.$$

Then, substituting $\alpha$, we obtain $g(\alpha)u(\alpha) = 1$, so $u(\alpha)$, which is in $\mathbb{Z}_2[\alpha]$,
is a multiplicative inverse of $g(\alpha) = a_0 + a_1 \alpha + a_2 \alpha^2$.


3-6. Solving Polynomials in $\mathbb{Z}_m[x]$

In this section we finally learn how to go about solving an arbitrary
concrete equation of form

$$\alpha_n x^n + \cdots + \alpha_1 x + \alpha_0 = 0, \qquad \alpha_0, \alpha_1, \ldots, \alpha_n \in \mathbb{Z}_m$$

or, as we find it more convenient to write,

$$\alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n = 0, \qquad \alpha_0, \alpha_1, \ldots, \alpha_n \in \mathbb{Z}_m. \tag{*}$$

A small part of this problem was treated, with complete success, in Section
3-1, namely, the case where the equation is linear (that is, where $n = 1$). Here
we place no restrictions on the positive integer $n$. Of course, by a solution of the
equation (*) we mean an element $\gamma \in \mathbb{Z}_m$ for which $\alpha_0 + \alpha_1 \gamma + \cdots + \alpha_n \gamma^n = 0$
(or, as it is commonly expressed: $\gamma$ satisfies the equation); and then the phrase
“solving the equation (*)” means finding all the solutions of (*).

Equivalently, if we consider the polynomial

$$f(x) = \alpha_n x^n + \alpha_{n-1} x^{n-1} + \cdots + \alpha_1 x + \alpha_0 = \alpha_0 + \alpha_1 x + \cdots + \alpha_n x^n = \sum_{i=0}^{n} \alpha_i x^i \tag{**}$$

in $\mathbb{Z}_m[x]$, then a solution of (*) is the same as a root of the polynomial $f(x)$, so
solving the equation (*) amounts to finding all the roots (in $\mathbb{Z}_m$) of the
polynomial $f(x)$.

Strictly speaking, the problem of finding all the roots of $f(x)$ can be settled
in a finite number of steps; in fact, there are $m$ choices for $\gamma \in \mathbb{Z}_m$ and for each
of them we simply evaluate $f(\gamma)$ to decide if it is a root. But this is not what
we have in mind; there is no mathematics involved, and anyway $m$ may be
very large. Instead, we develop a method which, in most cases, reduces
considerably the amount of work involved. To do this, it is convenient to
reformulate our problem in terms of integers (rather than residue classes) and
congruences. Thus, for each $i = 0, 1, \ldots, n$, let $a_i$ be an integer for which
$\lfloor a_i \rfloor_m = \alpha_i$, and consider the congruence

$$a_0 + a_1 x + \cdots + a_n x^n \equiv 0 \pmod{m}. \tag{***}$$

By a solution of this congruence we mean, as expected, an integer $c$ such that
$a_0 + a_1 c + \cdots + a_n c^n \equiv 0 \pmod{m}$.
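
For comparison with what follows, here is the exhaustive search mentioned above, written as a short Python sketch. The name `roots_mod` is ours; coefficients are integers listed with the constant term first, and the illustrative input $x^3 - x + 4$ is chosen by us.

```python
def roots_mod(coeffs, m):
    """Return every c in {0, 1, ..., m-1} with
    a_0 + a_1*c + ... + a_n*c^n congruent to 0 modulo m (m >= 2)."""
    return [
        c
        for c in range(m)
        if sum(a * pow(c, i, m) for i, a in enumerate(coeffs)) % m == 0
    ]


# x^3 - x + 4 tested against every residue modulo 7, then modulo 49.
print(roots_mod([4, -1, 0, 1], 7))
print(roots_mod([4, -1, 0, 1], 49))
```

The cost is one evaluation of the polynomial per residue class, so the work grows linearly with $m$; this is exactly the inefficiency the method developed in this section is meant to avoid.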

In parallel with the remarks in Section 3-1 we have the following facts about 
the solutions of the congruence (***) and their connection with the solutions 
of the equation (*): 

3-6-1. Facts. (1) $\gamma = \lfloor c \rfloor_m \in \mathbb{Z}_m$ is a solution of the equation (*) $\iff$ $c \in \mathbb{Z}$
is a solution of the congruence (***). This is immediate because

$$
\begin{aligned}
\alpha_0 + \alpha_1 \gamma + \cdots + \alpha_n \gamma^n = 0
&\iff \lfloor a_0 + a_1 c + \cdots + a_n c^n \rfloor_m = \lfloor 0 \rfloor_m \\
&\iff a_0 + a_1 c + \cdots + a_n c^n \equiv 0 \pmod{m}.
\end{aligned}
$$

(2) If $a_i' \equiv a_i \pmod{m}$ for $i = 0, 1, \ldots, n$, then the solutions of the
congruence $a_0 + a_1 x + \cdots + a_n x^n \equiv 0 \pmod{m}$ are identical with the solutions of the
congruence $a_0' + a_1' x + \cdots + a_n' x^n \equiv 0 \pmod{m}$. This is a consequence of
(1), and it also follows immediately from the fact that for $c \in \mathbb{Z}$, $a_0 + a_1 c + \cdots
+ a_n c^n \equiv a_0' + a_1' c + \cdots + a_n' c^n \pmod{m}$.

(3) In solving the equation (*) via the congruence (***), the choice of the
representatives $a_i$ for the various $\alpha_i$, $i = 0, 1, \ldots, n$, does not matter. This is
clear from parts (1) and (2).

(4) If $c \in \mathbb{Z}$ is a solution of the congruence (***), then so is every
element of $\lfloor c \rfloor_m$, since if $d \equiv c \pmod{m}$, then $a_0 + a_1 d + \cdots + a_n d^n \equiv
a_0 + a_1 c + \cdots + a_n c^n \equiv 0 \pmod{m}$ [in fact, for any polynomial $g(x) \in \mathbb{Z}[x]$,
$d \equiv c \pmod{m}$ implies $g(d) \equiv g(c) \pmod{m}$]. Because of this, one often views
all the elements of a residue class, or the residue class itself, as a single solution
of the congruence (***), especially when one is interested in counting the
solutions. Of course, these residue classes are precisely the solutions of the
equation (*).

With the preliminaries completed, let us turn to the problem at hand,
expressed in the language of congruences: namely, to solve any equation of
form

$$a_0 + a_1 x + \cdots + a_n x^n \equiv 0 \pmod{m}. \tag{***}$$

The first case to consider is obviously when $m$ is a prime $p$. In this situation,
our problem amounts to finding the roots of the polynomial $a_0 + a_1 x + \cdots
+ a_n x^n$ over the field $\mathbb{Z}_p$ (the polynomial is really $\lfloor a_0 \rfloor_p + \lfloor a_1 \rfloor_p x + \cdots
+ \lfloor a_n \rfloor_p x^n \in \mathbb{Z}_p[x]$, but our use of the less cumbersome notation should cause
no serious confusion). As seen in Section 3-5 we have some information about
the roots of this polynomial. However, we have no general recipe for finding
the roots, other than testing each element of $\mathbb{Z}_p$, and will have to content
ourselves with this state of affairs. Of course, in practice we shall deal with
small primes $p$, in which case the roots can be located easily.

As for the general case, where the modulus $m$ in (***) is arbitrary, we
shall see that solving it can be reduced to the case of a prime modulus $p$. As an
initial step in this direction, let us examine a few specific examples where the
modulus is a prime power $p^r$.

3-6-2. Examples. (i) Consider the congruence

$$x^3 - x + 4 \equiv 0 \pmod{7^2}. \tag{\#}$$

Instead of solving it by testing each of the 49 possibilities 0, 1, ..., 48, we
begin with the observation that if the integer $c$ is a solution, then it is surely a
solution of the congruence

$$x^3 - x + 4 \equiv 0 \pmod{7}. \tag{\#\#}$$

The basic idea then is to find the solutions of (##) and then to search for the
solutions of (#) among them.

By trial and error one sees that (##) has exactly one solution, 3 (or
better, $\lfloor 3 \rfloor_7$). [We note in passing that the existence of a unique solution reflects
the fact that, over the field $\mathbb{Z}_7$, the polynomial $x^3 - x + 4$ factors into
$(x - 3)(x^2 + 3x + 1)$, and $x^2 + 3x + 1$ is irreducible over $\mathbb{Z}_7$.] Consequently,
if $c \in \mathbb{Z}$ is a solution of (#), then it must be an element of $\lfloor 3 \rfloor_7 = \{3 + 7t \mid t \in \mathbb{Z}\}$,
so we seek an integer (or integers) $t$ for which

$$c = 3 + 7t, \qquad t \in \mathbb{Z}$$

is a solution of (#). This requires

$$(3 + 7t)^3 - (3 + 7t) + 4 \equiv 0 \pmod{7^2}$$

which becomes

$$28 + 182t + 441t^2 + 343t^3 \equiv 0 \pmod{7^2}.$$

Reducing the coefficients modulo $7^2 = 49$, we have

$$28 + 35t \equiv 0 \pmod{7^2}$$

which is equivalent to

$$4 + 5t \equiv 0 \pmod{7}.$$

This linear congruence has the unique solution $\lfloor 2 \rfloor_7$; in other words, the set of
all integer solutions is

$$t = 2 + 7u, \qquad u \in \mathbb{Z}.$$

Therefore,

$$c = 3 + 7(2 + 7u) = 17 + 7^2 u$$

for any and all $u \in \mathbb{Z}$, and we conclude that $\lfloor 17 \rfloor_{7^2}$ [or, as it is often written,
17 (mod 49)] is the unique solution of (#).

(ii) Consider the congruence

$$x^2 - x + 2 \equiv 0 \pmod{7^2}. \tag{$\ne$}$$

Following the above technique, if $c \in \mathbb{Z}$ is a solution, then it must be a solution
of

$$x^2 - x + 2 \equiv 0 \pmod{7}. \tag{$\ne\ne$}$$

By trial and error, this has the unique solution 4, meaning $\lfloor 4 \rfloor_7$. [One
normally expects the polynomial of second degree $x^2 - x + 2$ to have two
roots over $\mathbb{Z}_7$; but $x^2 - x + 2$ factors over $\mathbb{Z}_7$ into $(x - 4)(x - 4)$, so 4 is a
double root.] Thus, $c$ must be of form $c = 4 + 7t$ and it satisfies

$$(4 + 7t)^2 - (4 + 7t) + 2 \equiv 0 \pmod{7^2}.$$

This reduces to

$$14 \equiv 0 \pmod{7^2}$$

which is impossible. The conclusion is: The congruence $x^2 - x + 2 \equiv
0 \pmod{7^2}$ has no solutions.


(iii) To solve the equation

$$f(x) = x^2 + x + 1 \equiv 0 \pmod{7^2} \tag{*}$$

we look first at

$$f(x) = x^2 + x + 1 \equiv 0 \pmod{7}. \tag{**}$$

This has the two solutions $\lfloor 2 \rfloor_7$ and $\lfloor 4 \rfloor_7$, corresponding to the fact that over
$\mathbb{Z}_7$ we have the factorization $x^2 + x + 1 = (x - 2)(x - 4)$. Thus, if $c \in \mathbb{Z}$ is a
solution of (*), then it belongs to either $\lfloor 2 \rfloor_7$ or $\lfloor 4 \rfloor_7$; in other words, $c$ has to
be of form

$$c = 2 + 7t \quad \text{or} \quad c = 4 + 7t$$

for some choice of the integer $t$. Substituting the case $c = 2 + 7t$ in (*) leads
to

$$7 + 35t + 49t^2 \equiv 0 \pmod{7^2},$$

which reduces to

$$1 + 5t \equiv 0 \pmod{7}.$$

This has the unique solution $\lfloor 4 \rfloor_7$, so the integer $t$ is of form $4 + 7u$, $u \in \mathbb{Z}$,
and consequently $c = 30 + 7^2 u$. Thus, $\lfloor 30 \rfloor_{49}$ is a solution of (*).

In similar fashion, substituting the case $c = 4 + 7t$ in (*) yields

$$3 + 2t \equiv 0 \pmod{7},$$

which has the unique solution $\lfloor 2 \rfloor_7 = \{2 + 7u \mid u \in \mathbb{Z}\}$. It follows that $\lfloor 18 \rfloor_{49}$
is a solution of (*).

The conclusion is: $x^2 + x + 1 \equiv 0 \pmod{7^2}$ has the two solutions $\lfloor 30 \rfloor_{7^2}$,
$\lfloor 18 \rfloor_{7^2}$.

(iv) Suppose we wish to solve

$$x^2 + x + 1 \equiv 0 \pmod{7^3}. \tag{***}$$

The same kind of approach is clearly valid. If $c \in \mathbb{Z}$ is a solution, then it is
surely a solution of

$$x^2 + x + 1 \equiv 0 \pmod{7^2}.$$

As seen in part (iii), this congruence has the two solutions $\lfloor 30 \rfloor_{7^2}$ and $\lfloor 18 \rfloor_{7^2}$,
so $c$ has to be of the form

$$c = 30 + 7^2 t \quad \text{or} \quad c = 18 + 7^2 t$$

for appropriate choices of $t \in \mathbb{Z}$.

In the first case, we have then

$$(30 + 7^2 t)^2 + (30 + 7^2 t) + 1 \equiv 0 \pmod{7^3}$$

which leads to

$$931 + (61)(49)t \equiv 0 \pmod{7^3},$$

$$19 + 61t \equiv 0 \pmod{7},$$

$$5 + 5t \equiv 0 \pmod{7}.$$

This has the unique solution $\lfloor 6 \rfloor_7$, and then (***) has the solution

$$\lfloor 30 + 7^2 \cdot 6 \rfloor_{7^3} = \lfloor 324 \rfloor_{7^3}.$$

(This solution can also be written as $\lfloor -19 \rfloor_{7^3}$; it arises in this form when we
take $t = -1$ instead of $t = 6$.)

In the second case, we have

$$(18 + 7^2 t)^2 + (18 + 7^2 t) + 1 \equiv 0 \pmod{7^3}$$

which becomes

$$343 + 37 \cdot 7^2 t + 7^4 t^2 \equiv 0 \pmod{7^3}$$

and ends as

$$2t \equiv 0 \pmod{7}.$$

The only solution here is $\lfloor 0 \rfloor_7$, so for $t = 0$ we obtain $c = 18$ and the solution
$\lfloor 18 \rfloor_{7^3}$ of (***).

We have shown: The congruence $x^2 + x + 1 \equiv 0 \pmod{7^3}$ has the two
solutions $\lfloor -19 \rfloor_{7^3}$, $\lfloor 18 \rfloor_{7^3}$.

(v) To solve

$$x^2 - x - 12 \equiv 0 \pmod{7^2}$$

we consider first the congruence

$$x^2 - x - 12 \equiv 0 \pmod{7}$$

which is the same as

$$x^2 - x + 2 \equiv 0 \pmod{7}.$$

As noted in part (ii) this has the unique solution $\lfloor 4 \rfloor_7$. Therefore, any
integer solution of the original congruence is of form $c = 4 + 7t$, and it
satisfies

$$(4 + 7t)^2 - (4 + 7t) - 12 \equiv 0 \pmod{7^2}.$$

In other words,

$$49t^2 + 49t \equiv 0 \pmod{7^2}$$

so that

$$0t \equiv 0 \pmod{7},$$

a congruence that is true for any choice of $t$. To put it another way, the validity
of this congruence is independent of $t$, and it has the seven solutions $t \equiv
0, 1, 2, 3, 4, 5, 6 \pmod{7}$. It follows that $x^2 - x - 12 \equiv 0 \pmod{7^2}$ has seven
solutions, namely, $\lfloor 4 \rfloor_{7^2}$, $\lfloor 11 \rfloor_{7^2}$, $\lfloor 18 \rfloor_{7^2}$, $\lfloor 25 \rfloor_{7^2}$, $\lfloor 32 \rfloor_{7^2}$, $\lfloor 39 \rfloor_{7^2}$, $\lfloor 46 \rfloor_{7^2}$.
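
The procedure used in these five examples (solve modulo 7, then test the candidates $c + 7t$, then $c + 7^2 t$, and so on) can be sketched in Python as follows. The names `evaluate` and `lift_by_trial` are ours, and coefficients are listed with the constant term first. At each stage only $p$ candidates per known solution are tested, rather than every residue modulo the new prime power.

```python
def evaluate(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def lift_by_trial(coeffs, p, r):
    """Solutions of f(x) = 0 modulo p^r, found one power of p at a time:
    each solution c modulo p^s is replaced by those c + t*p^s,
    t = 0, ..., p-1, that are solutions modulo p^(s+1)."""
    solutions = [c for c in range(p) if evaluate(coeffs, c) % p == 0]
    modulus = p
    for _ in range(r - 1):
        next_modulus = modulus * p
        solutions = sorted(
            c + t * modulus
            for c in solutions
            for t in range(p)
            if evaluate(coeffs, c + t * modulus) % next_modulus == 0
        )
        modulus = next_modulus
    return solutions


print(lift_by_trial([4, -1, 0, 1], 7, 2))    # x^3 - x + 4   (mod 7^2)
print(lift_by_trial([2, -1, 1], 7, 2))       # x^2 - x + 2   (mod 7^2)
print(lift_by_trial([1, 1, 1], 7, 3))        # x^2 + x + 1   (mod 7^3)
print(lift_by_trial([-12, -1, 1], 7, 2))     # x^2 - x - 12  (mod 7^2)
```

Representatives are reported in the range $0, \ldots, p^r - 1$, so the class written $\lfloor -19 \rfloor_{7^3}$ in part (iv) appears as 324.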

From the preceding examples it would appear that the way to solve an
arbitrary polynomial congruence modulo a prime power $p^r$ is to solve it first
(mod $p$), then use these solutions to solve the congruence (mod $p^2$), and
proceed upward in this way one step at a time through the congruences
modulo increasing powers of $p$ until one reaches $p^r$. Indeed, this is the way
things go, but before analyzing and simplifying the passage from the solutions
(mod $p^s$) to the solutions (mod $p^{s+1}$), it is necessary to make a preliminary
comment.

3-6-3. Remark. Given any polynomial with integer coefficients

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n = \sum_{i=0}^{n} a_i x^i \in \mathbb{Z}[x],$$

we associate with it another polynomial with integer coefficients

$$f'(x) = a_1 + 2a_2 x + \cdots + n a_n x^{n-1} = \sum_{i=1}^{n} i a_i x^{i-1} \in \mathbb{Z}[x],$$

and call $f'(x)$ the derivative of $f(x)$. For example, if $f(x) = 2 + 3x - x^2 + 5x^3
+ x^6 - 2x^7$, then $f'(x) = 3 - 2x + 15x^2 + 6x^5 - 14x^6$.

The reader who is familiar with calculus may verify that this “formal 
derivative ” satisfies the usual properties of a derivative (as indicated in 3-3-15, 
Problem 24), but this is irrelevant here. The application of the derivative which 
we require arises as follows. For any $h \in \mathbb{Z}$ consider the polynomial

$$f(x + h) = a_0 + a_1(x + h) + a_2(x + h)^2 + \cdots + a_n(x + h)^n \in \mathbb{Z}[x].$$

If we expand, and combine all terms in which $h$ does not appear, the result is

$$a_0 + a_1 x + \cdots + a_n x^n \quad \text{which equals } f(x).$$

Combining all the terms of the expansion in which $h$ appears to the first power,
we have

$$h(a_1 + 2a_2 x + 3a_3 x^2 + \cdots + n a_n x^{n-1}) \quad \text{which equals } h f'(x).$$

The remaining terms all include $h$ to a power greater than or equal to 2; so
their sum may be put in the form

$$h^2 \cdot (\text{some element of } \mathbb{Z}[x]).$$

Thus, we have

$$f(x + h) = f(x) + h f'(x) + h^2 \cdot (\text{an element of } \mathbb{Z}[x])$$

and for any $c \in \mathbb{Z}$ this yields

$$f(c + h) = f(c) + h f'(c) + h^2 \cdot (\text{an integer}).$$
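
The last identity says that $f(c + h) - f(c) - h f'(c)$ is always divisible by $h^2$. The Python sketch below computes the formal derivative and spot-checks this divisibility on the example polynomial of this remark; the names `evaluate` and `derivative` are ours, and coefficients are listed with the constant term first.

```python
def evaluate(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def derivative(coeffs):
    """Formal derivative: a_1 + 2*a_2*x + ... + n*a_n*x^(n-1)."""
    return [i * a for i, a in enumerate(coeffs)][1:]


# f(x) = 2 + 3x - x^2 + 5x^3 + x^6 - 2x^7
f = [2, 3, -1, 5, 0, 0, 1, -2]
df = derivative(f)
print(df)

# f(c + h) - f(c) - h*f'(c) should be a multiple of h^2 for nonzero h.
for c in range(-3, 4):
    for h in (1, 2, 7, -5):
        remainder = evaluate(f, c + h) - evaluate(f, c) - h * evaluate(df, c)
        assert remainder % (h * h) == 0
```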

We are now in a position to analyze successfully the connections between
the solutions of a congruence modulo a prime power $p^s$ and its solutions
modulo the next highest prime power $p^{s+1}$.

3-6-4. Proposition. Given a polynomial $f(x) = a_0 + a_1 x + \cdots + a_n x^n \in
\mathbb{Z}[x]$, a prime $p$, and an integer $s \ge 1$, we have:

(i) If $c \in \mathbb{Z}$ is a solution of the congruence

$$f(x) \equiv 0 \pmod{p^{s+1}},$$

then $c$ is a solution of

$$f(x) \equiv 0 \pmod{p^i}$$

for each $i = 1, 2, \ldots, s$.

(ii) If $c_s \in \mathbb{Z}$ is a solution of the congruence

$$f(x) \equiv 0 \pmod{p^s} \tag{*}$$

and we put

$$c_{s+1} = c_s + t p^s, \qquad t \in \mathbb{Z},$$

then $c_{s+1}$ is a solution of the congruence

$$f(x) \equiv 0 \pmod{p^{s+1}} \tag{**}$$

if and only if the integer $t$ satisfies the relation

$$f'(c_s)\, t \equiv -\frac{f(c_s)}{p^s} \pmod{p}. \tag{\#}$$

(iii) If $\lfloor c_s \rfloor_{p^s}$ is a solution of (*), then the number of distinct solutions
$\lfloor c_{s+1} \rfloor_{p^{s+1}}$ of (**) which arise from it [that is, for which $c_{s+1} \equiv c_s \pmod{p^s}$] is

0 when $f'(c_s) \equiv 0 \pmod{p}$ and $\dfrac{f(c_s)}{p^s} \not\equiv 0 \pmod{p}$,

1 when $f'(c_s) \not\equiv 0 \pmod{p}$,

$p$ when $f'(c_s) \equiv 0 \pmod{p}$ and $\dfrac{f(c_s)}{p^s} \equiv 0 \pmod{p}$.


Proof: (i) This is trivial; $f(c) \equiv 0 \pmod{p^{s+1}}$ implies $f(c) \equiv 0 \pmod{p^i}$ for
every $i < s + 1$.

(ii) By hypothesis, $f(c_s) \equiv 0 \pmod{p^s}$, so $p^s \mid f(c_s)$ and $f(c_s)/p^s$, which
appears on the right side of (#), is indeed an integer. Then

$$
\begin{aligned}
c_{s+1} = c_s + t p^s \text{ is a solution of (**)}
&\iff f(c_s + t p^s) \equiv 0 \pmod{p^{s+1}} \\
&\iff f(c_s) + (t p^s) f'(c_s) + (t p^s)^2 \cdot (\text{an integer}) \equiv 0 \pmod{p^{s+1}} \\
&\iff f(c_s) + t p^s f'(c_s) \equiv 0 \pmod{p^{s+1}} \quad (\text{as } 2s \ge s + 1) \\
&\iff p^s \left( \frac{f(c_s)}{p^s} + t f'(c_s) \right) \equiv 0 \pmod{p^{s+1}} \\
&\iff \frac{f(c_s)}{p^s} + t f'(c_s) \equiv 0 \pmod{p} \\
&\iff f'(c_s)\, t \equiv -\frac{f(c_s)}{p^s} \pmod{p},
\end{aligned}
$$

so (ii) is proved.

(iii) If $c_s$ (more precisely, $\lfloor c_s \rfloor_{p^s}$) is a solution of $f(x) \equiv 0 \pmod{p^s}$, then a
solution $c_{s+1}$ (more precisely, $\lfloor c_{s+1} \rfloor_{p^{s+1}}$) of $f(x) \equiv 0 \pmod{p^{s+1}}$ is said to
“arise from $c_s$ (or, from $\lfloor c_s \rfloor_{p^s}$)” when $c_{s+1}$ is of form $c_s + t p^s$ for some $t \in \mathbb{Z}$,
or, equivalently, when $c_{s+1} \equiv c_s \pmod{p^s}$. Thus, the phrase “arise from” is
meant to indicate the passage [described in part (ii)] from a solution of (*) to
a solution of (**). Our purpose here is to see how many solutions of (**) we
obtain in this way. Of course, because counting is involved, it is understood
that we are dealing with residue classes as solutions.

Now, if both $\lfloor c_s + t_1 p^s \rfloor_{p^{s+1}}$ and $\lfloor c_s + t_2 p^s \rfloor_{p^{s+1}}$ are solutions of (**) which
arise from $\lfloor c_s \rfloor_{p^s}$, then clearly

$$\lfloor c_s + t_1 p^s \rfloor_{p^{s+1}} = \lfloor c_s + t_2 p^s \rfloor_{p^{s+1}} \iff t_1 \equiv t_2 \pmod{p},$$

so they are distinct solutions if and only if $t_1 \not\equiv t_2 \pmod{p}$. Therefore, in
virtue of part (ii), the number of distinct solutions of $f(x) \equiv 0 \pmod{p^{s+1}}$
arising from $\lfloor c_s \rfloor_{p^s}$ is precisely the number of distinct solutions of the
congruence

$$f'(c_s)\, t \equiv -\frac{f(c_s)}{p^s} \pmod{p}. \tag{\#}$$

But this is a linear congruence modulo the prime $p$, and the theory of a linear
congruence was settled completely in Section 3-1. Accordingly, it is easy to
see that (#) has either 0, 1, or $p$ solutions, and the conditions under which
these three possibilities occur are those stated in (iii). This completes the
proof. ∎
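
Part (iii) of 3-6-4 gives a count that depends only on $f'(c_s)$ and $f(c_s)/p^s$ modulo $p$. As an illustration, the Python sketch below computes that count; the names `evaluate`, `derivative` and `count_lifts` are ours, coefficients are listed with the constant term first, and `p` is assumed to be prime.

```python
def evaluate(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_n*x^n by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def derivative(coeffs):
    """Formal derivative of an integer polynomial (constant term first)."""
    return [i * a for i, a in enumerate(coeffs)][1:]


def count_lifts(coeffs, p, s, c):
    """Number of solutions modulo p^(s+1) arising from the solution c
    modulo p^s, following the three cases of part (iii)."""
    modulus = p ** s
    value = evaluate(coeffs, c)
    if value % modulus != 0:
        raise ValueError("c is not a solution modulo p^s")
    slope = evaluate(derivative(coeffs), c) % p      # f'(c_s) mod p
    quotient = (value // modulus) % p                # f(c_s)/p^s mod p
    if slope != 0:
        return 1
    return p if quotient == 0 else 0


print(count_lifts([1, 1, 1], 7, 1, 2))      # x^2 + x + 1 at c = 2
print(count_lifts([2, -1, 1], 7, 1, 4))     # x^2 - x + 2 at c = 4
print(count_lifts([-12, -1, 1], 7, 1, 4))   # x^2 - x - 12 at c = 4
```

These three calls correspond to parts (iii), (ii), and (v) of 3-6-2, where the solution modulo 7 gave rise to one, no, and seven solutions modulo $7^2$, respectively.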

3-6-5. Remark. Basing ourselves on this result, let us summarize the 
method for solving a polynomial congruence 

$$f(x) \equiv 0 \pmod{p^r}.$$

We start by solving 

$$f(x) \equiv 0 \pmod{p}$$

by trial and error, as a rule. In keeping with the notation of 3-6-4, denote a 
typical such solution by $c_1$. The next step involves solving 

$$f(x) \equiv 0 \pmod{p^2}.$$

To find a typical solution $c_2 = c_1 + tp$ of this congruence, we solve the linear 
congruence 

$$f'(c_1)\,t \equiv -\frac{f(c_1)}{p} \pmod{p}$$

[which is the congruence (#) in the case $s = 1$] for $t$. Of course, this must be 
done for each choice of $c_1$. 

In the same way, $c_2$ is used to find a typical solution $c_3 = c_2 + tp^2$ of the 
congruence 

$$f(x) \equiv 0 \pmod{p^3},$$

where $t$ must satisfy 

$$f'(c_2)\,t \equiv -\frac{f(c_2)}{p^2} \pmod{p}.$$

This procedure is repeated to solve $f(x) \equiv 0$ modulo $p^4, p^5, \ldots, p^r$. At each 
stage $c_i$ leads to either 0, 1, or $p$ choices for $c_{i+1}$. We finally have solutions $c_r$ of 
the original congruence. Incidentally, it is clear that 

$$c_2 \equiv c_1 \pmod{p}, \quad c_3 \equiv c_2 \pmod{p^2}, \quad \ldots, \quad c_{s+1} \equiv c_s \pmod{p^s}, \quad \ldots$$

so if we start from a specific $c_1$ and finally end up with $c_r$, then 

$$c_1 \equiv c_2 \equiv \cdots \equiv c_r \pmod{p}.$$

How do we know that we have located all solutions of $f(x) \equiv 0 \pmod{p^r}$ 
in this way? This is not hard. Suppose $c$ (more precisely, $\lfloor c \rfloor_{p^r}$) is any solution 
of $f(x) \equiv 0 \pmod{p^r}$; we must show that $c$ occurs as the end result of some 
chain of solutions $c_1, c_2, \ldots, c_r = c$. Now, $c$ (and $\lfloor c \rfloor_p$) is a solution of 
$f(x) \equiv 0 \pmod{p}$, so $c$ may be chosen as $c_1$. To find $c_2$ when $c_1 = c$, we solve 

$$f'(c)\,t \equiv -\frac{f(c)}{p} \pmod{p}.$$

Since $p^r \mid f(c)$, the right-hand side is congruent to 0 (mod $p$), so one solution 
for $t$ is $t = 0$, and we then obtain $c_2 = c$. Proceeding in this fashion (and using 
the fact that $f(c)/p^s \equiv 0 \pmod{p}$ for every $s < r$) we have a chain of solutions 

$$c_1 = c, \quad c_2 = c, \quad c_3 = c, \quad \ldots, \quad c_s = c, \quad \ldots, \quad c_r = c.$$

This does it. 
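The procedure of this remark can be sketched in code. The following is only an illustrative sketch, not part of the text: the function names `poly_eval`, `poly_deriv` and `solve_prime_power` are hypothetical, polynomials are assumed to be given as integer coefficient lists with the highest degree first, and the linear congruence (#) is solved by simply trying every $t$ from 0 to $p - 1$.

```python
def poly_eval(coeffs, x):
    """Evaluate an integer polynomial (coefficients highest degree first) by Horner's rule."""
    value = 0
    for a in coeffs:
        value = value * x + a
    return value


def poly_deriv(coeffs):
    """Coefficient list of the formal derivative f'(x)."""
    n = len(coeffs) - 1
    return [a * (n - i) for i, a in enumerate(coeffs[:-1])]


def solve_prime_power(coeffs, p, r):
    """Sorted list of all c in range(p**r) with f(c) = 0 (mod p**r)."""
    dcoeffs = poly_deriv(coeffs)
    # First stage: solve modulo p by trial and error.
    sols = [c for c in range(p) if poly_eval(coeffs, c) % p == 0]
    modulus = p  # current modulus p**s
    for _ in range(1, r):
        lifted = []
        for c in sols:
            # Right-hand side of (#): -f(c_s)/p**s reduced mod p (the division is exact).
            rhs = (-(poly_eval(coeffs, c) // modulus)) % p
            slope = poly_eval(dcoeffs, c) % p  # f'(c_s) mod p
            # Linear congruence slope * t = rhs (mod p): it has 0, 1, or p solutions t.
            for t in range(p):
                if (slope * t - rhs) % p == 0:
                    lifted.append(c + t * modulus)
        sols = lifted
        modulus *= p
    return sorted(sols)


print(solve_prime_power([1, 1, 1], 7, 3))  # x^2 + x + 1 modulo 7^3
```

Each pass of the outer loop moves from solutions modulo $p^s$ to solutions modulo $p^{s+1}$, so the three cases "0, 1, or $p$ choices" show up as the number of values of $t$ accepted in the inner loop.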

It is of some interest to distinguish one situation in which the method 
described above goes very smoothly. 

3-6-6. Corollary. Suppose $c_1$ is a solution of $f(x) \equiv 0 \pmod{p}$ with 
$f'(c_1) \not\equiv 0 \pmod{p}$; then the congruence $f(x) \equiv 0 \pmod{p^r}$ has a solution 
$\lfloor c_r \rfloor_{p^r}$ which is the unique solution arising from $\lfloor c_1 \rfloor_p$ via a chain of solutions 
$c_1, c_2, \ldots, c_r$. 

Proof : It suffices to guarantee that a chain of solutions $c_1, c_2, \ldots, c_r$ 
exists, and that the chain $\lfloor c_1 \rfloor_p, \lfloor c_2 \rfloor_{p^2}, \ldots, \lfloor c_r \rfloor_{p^r}$ is unique. This is done 
inductively. 

Starting from $c_1$, we find $c_2 = c_1 + tp$ by solving 

$$f'(c_1)\,t \equiv -\frac{f(c_1)}{p} \pmod{p}.$$

Because $f'(c_1) \not\equiv 0 \pmod{p}$, 3-6-4, part (iii) tells us that a solution $\lfloor c_2 \rfloor_{p^2}$ arising 
from $\lfloor c_1 \rfloor_p$ exists and is unique. Next, we must locate $c_3 = c_2 + tp^2$ by solving 

$$f'(c_2)\,t \equiv -\frac{f(c_2)}{p^2} \pmod{p}.$$

Since $c_1 \equiv c_2 \pmod{p}$, we have $f'(c_1) \equiv f'(c_2) \pmod{p}$, so this congruence may 
be put in the form 

$$f'(c_1)\,t \equiv -\frac{f(c_2)}{p^2} \pmod{p}.$$

Hence, a solution $\lfloor c_3 \rfloor_{p^3}$ arising from $\lfloor c_2 \rfloor_{p^2}$ exists and is unique. 

This state of affairs recurs as we move along the chain; since $c_s \equiv c_1 \pmod{p}$ 
and $f'(c_s) \equiv f'(c_1) \pmod{p}$ for each $s < r$, to find $c_{s+1} = c_s + tp^s$ from $c_s$ we 
need to solve 

$$f'(c_1)\,t \equiv -\frac{f(c_s)}{p^s} \pmod{p} \qquad (\#\#)$$

so, clearly, $\lfloor c_{s+1} \rfloor_{p^{s+1}}$ arising from $\lfloor c_s \rfloor_{p^s}$ exists and is unique. Thus, $\lfloor c_r \rfloor_{p^r}$ 
exists and is unique. The details are left to the reader. | 

Note: In treating numerical examples with $f'(c_1) \not\equiv 0 \pmod{p}$, the 
foregoing says, in particular, that at each stage we may solve 

$$f'(c_1)\,t \equiv -\frac{f(c_s)}{p^s} \pmod{p}, \qquad s = 1, 2, \ldots, r - 1$$

instead of the customary congruence 

$$f'(c_s)\,t \equiv -\frac{f(c_s)}{p^s} \pmod{p}.$$

This reduces somewhat the work involved in arriving at c r . 

3-6-7. Examples. Let us illustrate the general technique discussed in 3-6-4, 
3-6-5, 3-6-6 by using it to solve the congruences that were examined in 3-6-2. 

(i) To solve 

$$f(x) = x^3 - x + 4 \equiv 0 \pmod{7^2}, \qquad (p = 7)$$

we note first, as before, that $c_1 = 3$ is a solution of 

$$f(x) = x^3 - x + 4 \equiv 0 \pmod{7}.$$

In fact, $\lfloor 3 \rfloor_7$ is the only solution. 

Then $f'(x) = 3x^2 - 1$, so the equation 

$$f'(c_1)\,t \equiv -\frac{f(c_1)}{p} \pmod{p}$$

used to find $c_2 = c_1 + tp = 3 + 7t$ becomes [since $f(3) = 28$, and $f'(3) = 26$] 

$$26t \equiv -4 \pmod{7}.$$

Because $f'(c_1) = 26 \not\equiv 0 \pmod{7}$, there is a unique solution $\lfloor c_2 \rfloor_{7^2}$. In fact, 
solving $5t \equiv -4 \pmod{7}$ [which is a simpler version of $26t \equiv -4 \pmod{7}$] 
yields $t = 2$, so $c_2 = 3 + 7 \cdot 2 = 17$; thus $\lfloor 17 \rfloor_{7^2}$ is the unique solution of the 
original congruence. 

(ii) To solve 

$$f(x) = x^2 - x + 2 \equiv 0 \pmod{7^2}, \qquad (p = 7)$$

we observe that 

$$f(x) = x^2 - x + 2 \equiv 0 \pmod{7}$$

has the unique solution $\lfloor c_1 \rfloor_7 = \lfloor 4 \rfloor_7$. Then $f'(x) = 2x - 1$, $f(c_1) = f(4) = 14$, 
$f'(c_1) = f'(4) = 7$, and the equation 

$$f'(c_1)\,t \equiv -\frac{f(c_1)}{p} \pmod{p}$$

via which we find $c_2 = c_1 + tp = 4 + 7t$ takes the form 

$$7t \equiv -2 \pmod{7} \quad \text{or} \quad 0t \equiv -2 \pmod{7}.$$

This has no solution, so $c_2$ does not exist; or equivalently, according to the 
criterion of 3-6-4, part (iii), there are no solutions $\lfloor c_2 \rfloor_{7^2}$ arising from $\lfloor c_1 \rfloor_7$. 
Thus the original congruence has no solution. 

(iii) To solve 

$$f(x) = x^2 + x + 1 \equiv 0 \pmod{7^2}$$

we start from the two solutions 

$$c_1 = 2, \qquad c_1 = 4$$

of 

$$f(x) = x^2 + x + 1 \equiv 0 \pmod{7}.$$

Each of these solutions must be pursued to the next stage. 

If $c_1 = 2$, then $f(c_1) = f(2) = 7$ and $f'(c_1) = f'(2) = 5$, so that 
$c_2 = c_1 + tp = 2 + 7t$ must satisfy 

$$5t \equiv -\frac{7}{7} \pmod{7}.$$

Now $5t \equiv -1 \pmod{7}$ has the solution $t = 4$, so $c_2 = 2 + 7 \cdot 4 = 30$ is a 
solution of the original congruence. According to 3-6-4, part (iii), since 
$f'(2) \not\equiv 0 \pmod{7}$, $\lfloor 30 \rfloor_{7^2}$ is the unique solution of the original which arises 
from $\lfloor c_1 \rfloor_7 = \lfloor 2 \rfloor_7$. 

If $c_1 = 4$, then $f(c_1) = f(4) = 21$ and $f'(c_1) = f'(4) = 9$, so 
$c_2 = c_1 + tp = 4 + 7t$ has to satisfy 

$$9t \equiv -\frac{21}{7} \pmod{7} \quad \text{or} \quad 2t \equiv -3 \pmod{7}.$$

Taking $t = 2$ we obtain $c_2 = 18$; and $\lfloor 18 \rfloor_{7^2}$ is the unique solution of the 
original arising from $\lfloor c_1 \rfloor_7 = \lfloor 4 \rfloor_7$. 

Thus, $x^2 + x + 1 \equiv 0 \pmod{49}$ has exactly two solutions: $\lfloor 30 \rfloor_{7^2}$, $\lfloor 18 \rfloor_{7^2}$. 

(iv) To solve 

$$f(x) = x^2 + x + 1 \equiv 0 \pmod{7^3}$$

we need to continue the two chains of solutions 

$$c_1 = 2,\ c_2 = 30 \qquad\qquad c_1 = 4,\ c_2 = 18$$

found above one step further. Recalling from above that $f'(c_1) \not\equiv 0 \pmod{p}$ 
($p = 7$), and making use of 3-6-6, we see that $c_3 = c_2 + tp^2$ may be obtained, 
in each case, by solving 

$$f'(c_1)\,t \equiv -\frac{f(c_2)}{p^2} \pmod{p}.$$

In the case $c_1 = 2$, $c_2 = 30$ we have 

$$f'(2)\,t \equiv -\frac{f(30)}{7^2} \pmod{7},$$

which becomes 

$$5t \equiv -\frac{931}{49} = -19 \equiv 2 \pmod{7}.$$

Taking $t = 6$ gives $c_3 = 324$; so $\lfloor 324 \rfloor_{7^3}$ is the unique solution of 
$f(x) \equiv 0 \pmod{7^3}$ which arises from $\lfloor c_1 \rfloor_7 = \lfloor 2 \rfloor_7$. 

In the case $c_1 = 4$, $c_2 = 18$ we have 

$$9t \equiv -\frac{343}{49} \pmod{7} \quad \text{or} \quad 2t \equiv 0 \pmod{7}.$$

Taking $t = 0$ gives $c_3 = 18$; so $\lfloor 18 \rfloor_{7^3}$ is the unique solution of $f(x) \equiv 0 \pmod{7^3}$ 
which arises from $\lfloor c_1 \rfloor_7 = \lfloor 4 \rfloor_7$. 

Thus, $x^2 + x + 1 \equiv 0 \pmod{7^3}$ has exactly two solutions: $\lfloor 324 \rfloor_{7^3}$, $\lfloor 18 \rfloor_{7^3}$. 

(v) It may be safely left for the reader to solve 

$$f(x) = x^2 - x - 12 \equiv 0 \pmod{7^2}$$

and find its seven solutions. 
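The five examples above can be double-checked by exhaustive search, since the moduli are small. This is merely a sanity check written for illustration (the helper name `roots_mod` is hypothetical); it tests every residue class and so does not use the lifting method at all.

```python
def roots_mod(coeffs, m):
    """All x in range(m) with f(x) = 0 (mod m); coefficients are listed highest degree first."""
    def f(x):
        value = 0
        for a in coeffs:
            value = value * x + a
        return value
    return [x for x in range(m) if f(x) % m == 0]


print(roots_mod([1, 0, -1, 4], 7**2))   # (i)   -> [17]
print(roots_mod([1, -1, 2], 7**2))      # (ii)  -> []
print(roots_mod([1, 1, 1], 7**2))       # (iii) -> [18, 30]
print(roots_mod([1, 1, 1], 7**3))       # (iv)  -> [18, 324]
print(roots_mod([1, -1, -12], 7**2))    # (v)   -> [4, 11, 18, 25, 32, 39, 46]
```

In (v) the seven solutions are exactly the residues congruent to 4 modulo 7, which is the "$p$ solutions" case of 3-6-4, part (iii).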

3-6-8. Remark. Having seen how the problem of solving a polynomial 
congruence modulo any prime power $p^r$ reduces essentially to that of solving 
the congruence modulo the prime $p$, let us now show how the problem of 
solving a polynomial congruence 

$$f(x) \equiv 0 \pmod{m}, \qquad (*)$$

where $m$ is arbitrary, reduces to solving the congruence modulo certain prime 
powers. 

Suppose the prime-power factorization of $m$ is $m = p_1^{r_1} p_2^{r_2} \cdots p_n^{r_n}$ and for 
convenience let us put $m_1 = p_1^{r_1}, m_2 = p_2^{r_2}, \ldots, m_n = p_n^{r_n}$. Thus, $m = m_1 m_2 \cdots m_n$ 
and, because $m_1, m_2, \ldots, m_n$ are relatively prime in pairs, 
$[m_1, m_2, \ldots, m_n] = m_1 m_2 \cdots m_n$. Consider the system of congruences 

$$\begin{aligned} f(x) &\equiv 0 \pmod{m_1}, \\ f(x) &\equiv 0 \pmod{m_2}, \\ &\ \ \vdots \\ f(x) &\equiv 0 \pmod{m_n}. \end{aligned} \qquad (**)$$

We say that an integer $c$ is a solution of this system when it is a solution of 
each of the congruences; that is, when $f(c) \equiv 0 \pmod{m_i}$ for $i = 1, 2, \ldots, n$. 
Clearly, the integer $c$ is a solution of the congruence (*) if and only if it is a 
solution of the system (**) [since $m$ divides $f(c)$ $\iff$ $m_i$ divides $f(c)$ for 
$i = 1, \ldots, n$]. 

Now, we can find solutions (if any exist) of (**) as follows. For each 
$i = 1, \ldots, n$ we solve the $i$th congruence $f(x) \equiv 0 \pmod{m_i}$ by the procedure 
of 3-6-5. Suppose $c^{(1)}, c^{(2)}, \ldots, c^{(n)}$ are solutions of the individual congruences; 
more precisely, $c^{(i)}$ is a solution of $f(x) \equiv 0 \pmod{m_i}$ for each $i = 1, 2, \ldots, n$. 
By the Chinese remainder theorem, we can find an integer $c$ satisfying 

$$c \equiv c^{(i)} \pmod{m_i}, \qquad i = 1, 2, \ldots, n.$$

Then for each $i = 1, 2, \ldots, n$ 

$$f(c) \equiv f(c^{(i)}) \equiv 0 \pmod{m_i}$$

so $c$ is a solution of the system (**) and of the original congruence 
$f(x) \equiv 0 \pmod{m}$. 

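How many solutions are there? This involves counting elements of residue 
class rings. Let $S_m$ denote the set of elements of $Z_m$ which are solutions of 
$f(x) \equiv 0 \pmod{m}$; so $\#(S_m)$ is the number of solutions of $f(x) \equiv 0 \pmod{m}$, and 
let us write this as $\#(m)$. Similarly, for each $i$, $S_{m_i}$ is the set of elements of 
$Z_{m_i}$ which are solutions of $f(x) \equiv 0 \pmod{m_i}$; so $\#(m_i) = \#(S_{m_i})$ is the 
number of solutions of $f(x) \equiv 0 \pmod{m_i}$. We produce a 1-1 correspondence 
between $S_m$ and the product set $S_{m_1} \times S_{m_2} \times \cdots \times S_{m_n}$. If $\lfloor c \rfloor_m \in S_m$, then, for 
each $i$, $c$ is a solution of the $i$th congruence; that is, $\lfloor c \rfloor_{m_i} \in S_{m_i}$. Thus, 

$$\lfloor c \rfloor_m \longrightarrow (\lfloor c \rfloor_{m_1}, \lfloor c \rfloor_{m_2}, \ldots, \lfloor c \rfloor_{m_n})$$

is a mapping of $S_m \to S_{m_1} \times S_{m_2} \times \cdots \times S_{m_n}$. On the other hand, an element of 
the product set is of form 

$$(\lfloor c^{(1)} \rfloor_{m_1}, \lfloor c^{(2)} \rfloor_{m_2}, \ldots, \lfloor c^{(n)} \rfloor_{m_n}),$$

where for each $i$, $c^{(i)}$ is a solution of the $i$th congruence. Then for an integer 
$c$ satisfying $c \equiv c^{(i)} \pmod{m_i}$, $i = 1, \ldots, n$ (whose existence is guaranteed by 
the Chinese remainder theorem), we have 

$$(\lfloor c \rfloor_{m_1}, \lfloor c \rfloor_{m_2}, \ldots, \lfloor c \rfloor_{m_n}) = (\lfloor c^{(1)} \rfloor_{m_1}, \lfloor c^{(2)} \rfloor_{m_2}, \ldots, \lfloor c^{(n)} \rfloor_{m_n})$$

and (again by the Chinese remainder theorem) $\lfloor c \rfloor_m$ is uniquely determined by 
the given element of the product set. Thus, 

$$(\lfloor c^{(1)} \rfloor_{m_1}, \lfloor c^{(2)} \rfloor_{m_2}, \ldots, \lfloor c^{(n)} \rfloor_{m_n}) \longrightarrow \lfloor c \rfloor_m$$

is a mapping of $S_{m_1} \times S_{m_2} \times \cdots \times S_{m_n} \to S_m$. 

It is easy to see that the two mappings just defined are inverses of each 
other, and provide a one-to-one correspondence between the sets $S_m$ and 
$S_{m_1} \times S_{m_2} \times \cdots \times S_{m_n}$. In particular, 

$$\#(S_m) = \#(S_{m_1} \times S_{m_2} \times \cdots \times S_{m_n})$$

and 

$$\#(m) = \prod_{i=1}^{n} \#(m_i).$$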


Observing that in the derivation of this formula the form of the $m_i$'s does not 
enter—only the fact that they are relatively prime in pairs—we can summarize 
the discussion as follows. 


3-6-9. Proposition. Suppose $m = m_1 m_2 \cdots m_n$ where $m_1, \ldots, m_n$ are 
relatively prime in pairs. Then we can find the solutions of 

$$f(x) \equiv 0 \pmod{m}$$

from the solutions of the individual congruences of the system 

$$\begin{aligned} f(x) &\equiv 0 \pmod{m_1}, \\ f(x) &\equiv 0 \pmod{m_2}, \\ &\ \ \vdots \\ f(x) &\equiv 0 \pmod{m_n}, \end{aligned}$$

via the Chinese remainder theorem. The number of solutions of 
$f(x) \equiv 0 \pmod{m}$ is given by 

$$\#(m) = \prod_{i=1}^{n} \#(m_i).$$

In particular, $\#(m) = 0 \iff$ some $\#(m_i) = 0$; in other words, 
$f(x) \equiv 0 \pmod{m}$ has no solution $\iff$ one (or more) of the congruences 
$f(x) \equiv 0 \pmod{m_i}$ has no solution. 


3-6-10. Example. We solve 

$$f(x) = x^3 + 90x^2 + 83x + 99 \equiv 0 \pmod{3^2 \cdot 5^2 \cdot 7}.$$

As per 3-6-8 and 3-6-9, we first solve each of the congruences 

$$f(x) \equiv 0 \pmod{3^2}, \qquad f(x) \equiv 0 \pmod{5^2}, \qquad f(x) \equiv 0 \pmod{7}.$$

These are, respectively, the congruences 

(I) $x^3 + 2x \equiv 0 \pmod{3^2}$, 

(II) $x^3 + 15x^2 + 8x - 1 \equiv 0 \pmod{5^2}$, 

(III) $x^3 - x^2 - x + 1 \equiv 0 \pmod{7}$. 

To solve (I), we start with 

$$x^3 + 2x \equiv 0 \pmod{3},$$

which has the three solutions 0, 1, 2 (mod 3); in fact, in $Z_3[x]$ we have 
$x^3 + 2x = x(x - 1)(x - 2)$. 

By the method of 3-6-4 and 3-6-5 these solutions lead, respectively, to the 
solutions 0, 4, 5 of (I). Thus, $x^3 + 2x \equiv 0 \pmod{3^2}$ has exactly three solutions: 
$\lfloor 0 \rfloor_{3^2}$, $\lfloor 4 \rfloor_{3^2}$, $\lfloor 5 \rfloor_{3^2}$. 

To solve (II), we start with 

$$x^3 + 15x^2 + 8x - 1 \equiv 0 \pmod{5},$$

which is $x^3 + 3x - 1 \equiv 0 \pmod{5}$. This has the two solutions 3, 4 (mod 5); in 
fact, 3 is a double solution (or root), because in $Z_5[x]$, we have 

$$x^3 + 15x^2 + 8x - 1 = x^3 + 3x - 1 = (x - 3)^2 (x - 4).$$

By the method of 3-6-4 and 3-6-5, there is no solution of (II) arising from the 
solution 3. On the other hand, the solution 4 leads, by this method, to the 
solution 19 of (II). Thus, $x^3 + 15x^2 + 8x - 1 \equiv 0 \pmod{5^2}$ has exactly one 
solution: $\lfloor 19 \rfloor_{5^2}$. 

To solve (III) is easy. It has the two solutions 1, 6 (mod 7); in fact, in 
$Z_7[x]$ we have 

$$x^3 - x^2 - x + 1 = (x - 1)^2 (x - 6).$$

Thus, $x^3 - x^2 - x + 1 \equiv 0 \pmod{7}$ has the two solutions: $\lfloor 1 \rfloor_7$, $\lfloor 6 \rfloor_7 = \lfloor -1 \rfloor_7$. 

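Returning to the original congruence, it has $3 \cdot 1 \cdot 2 = 6$ solutions. They 
are to be found, by the Chinese remainder theorem, from the six congruences 
of form 

$$x \equiv a_1 \pmod{9}, \qquad x \equiv a_2 \pmod{25}, \qquad x \equiv a_3 \pmod{7},$$

where $a_1 = 0$, 4, or 5, $a_2 = 19$, $a_3 = 1$ or $-1$. The six solutions turn out to be 

$$\lfloor 419 \rfloor_{1575},\ \lfloor 594 \rfloor_{1575},\ \lfloor 769 \rfloor_{1575},\ \lfloor 869 \rfloor_{1575},\ \lfloor 1044 \rfloor_{1575},\ \lfloor 1219 \rfloor_{1575}.$$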
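The final step of this example, combining solutions modulo 9, 25 and 7 by the Chinese remainder theorem, can be sketched as follows. This is an illustrative sketch only: the names `crt_pair` and `combine` are hypothetical, the moduli are assumed to be relatively prime in pairs, and the modular inverse is taken with the three-argument `pow` (available in Python 3.8 and later).

```python
from itertools import product


def crt_pair(a1, m1, a2, m2):
    """Solve x = a1 (mod m1), x = a2 (mod m2) for relatively prime m1, m2.

    Returns the solution reduced modulo m1*m2 together with the new modulus.
    """
    inv = pow(m1, -1, m2)              # inverse of m1 modulo m2
    t = ((a2 - a1) * inv) % m2         # choose t so that a1 + m1*t = a2 (mod m2)
    return (a1 + m1 * t) % (m1 * m2), m1 * m2


def combine(solution_sets):
    """solution_sets: list of (solutions, modulus) pairs with pairwise coprime moduli.

    Returns every residue modulo the product of the moduli obtained by picking
    one solution from each set, in sorted order.
    """
    residues = [sols for sols, _ in solution_sets]
    moduli = [m for _, m in solution_sets]
    results = []
    for choice in product(*residues):
        x, m = 0, 1
        for a, mi in zip(choice, moduli):
            x, m = crt_pair(x, m, a, mi)
        results.append(x)
    return sorted(results)


# Solutions of (I), (II), (III) found in the example.
print(combine([([0, 4, 5], 9), ([19], 25), ([1, 6], 7)]))
# -> [419, 594, 769, 869, 1044, 1219]
```

The number of tuples produced by `product` is $3 \cdot 1 \cdot 2 = 6$, which is the count $\#(m) = \prod \#(m_i)$ of 3-6-9.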




3-6-11 / PROBLEMS 

1. Use the elementary technique illustrated in 3-6-2 to solve: 

(i) $x^3 - x + 4 \equiv 0 \pmod{7^3}$, 

(ii) $x^3 - x + 4 \equiv 0 \pmod{7^4}$, 

(iii) $x^2 + x + 1 \equiv 0 \pmod{7^4}$, 

(iv) $x^2 - x - 12 \equiv 0 \pmod{7^3}$. 

2. Do each part of Problem 1 by the general method. 

3. By solving $f(x) = x^2 - 5x - 15 \equiv 0 \pmod{3^3}$ show that it has exactly two 
solutions: $\lfloor 15 \rfloor_{27}$, $\lfloor 17 \rfloor_{27}$. 

4. Let $f(x)$ be any one of the polynomials 

(i) $x^3 - x + 3$,          (v) $x^3 + 2x - 5$, 

(ii) $x^3 - 2x + 3$,        (vi) $x^3 - 2x^2 + 1$, 

(iii) $x^3 + 2x - 3$,       (vii) $x^2 + x + 7$, 

(iv) $x^3 + x^2 - 4$,       (viii) $x^4 + x + 1$. 

Solve the congruence $f(x) \equiv 0 \pmod{m}$ for each of the following choices 
of $m$: 

(a) 3, (b) 9, (c) 27, (d) 81. 

5. Do Problem 4 for $m$ equal to 

(a) 2, (b) 4, (c) 8, (d) 16, (e) 32, (f) 64. 

6. Do Problem 4 for the following values of $m$: 

(a) 5, (b) 25, (c) 125. 

7. Do Problem 4 for $m$ equal to 

(a) 7, (b) 49, (c) 343. 

8. Do Problem 4 when $m$ equals 

(a) 15, (b) 45, (c) 75, (d) 135. 

9. Do Problem 4 when $m$ equals 

(a) 21, (b) 63, (c) 147, (d) 189. 

10. Do Problem 4 for $m$ equal to 

(a) 35, (b) 175. 

11. Do Problem 4 when $m$ is 

(a) 210, (b) 280, (c) 336, (d) 490. 

12. Consider $f(x) = x^3 - 4x^2 - 11x + 30$ viewed as a polynomial in $Z_m[x]$ 
when $m$ equals 

(i) 419, (ii) 463, (iii) 91, (iv) 143. 

In each case, find all the roots in $Z_m$. (Incidentally, 419 and 463 are prime.) 

13. Without actually finding the solutions, decide how many solutions there 
are to each of the following congruences. 

(i) $x^3 - x + 1 \equiv 0 \pmod{3^{10} \cdot 13^5}$, 

(ii) $x^3 - x + 1 \equiv 0 \pmod{5^7 \cdot 7^5}$, 

(iii) $x^3 - x + 1 \equiv 0 \pmod{7^8 \cdot 11^4}$, 

(iv) $x^3 + 5x - 3 \equiv 0 \pmod{3^{10} \cdot 5^5}$, 

(v) $x^3 + 2x - 3 \equiv 0 \pmod{3^5 \cdot 5^2}$. 

14. Construct, if possible, a polynomial of degree 3 for which the number of 
solutions of $f(x) \equiv 0 \pmod{5^2}$ is: 

(i) 0, (ii) 1, (iii) 5, (iv) 2, (v) 3, (vi) 4. 

Which additional integers can represent the number of solutions? 

15. If $f(x)$ is of degree 3 and $p$ is prime, what are the possibilities for the number 
of solutions of $f(x) \equiv 0 \pmod{p^2}$? 

16. Suppose $c_1$ is a solution of $f(x) \equiv 0 \pmod{p}$ with $f'(c_1) \not\equiv 0 \pmod{p}$. Prove 
that if $c$ is a solution of $f(x) \equiv 0 \pmod{p^r}$ with $c \equiv c_1 \pmod{p}$, then $\lfloor c \rfloor_{p^r}$ is 
the solution of $f(x) \equiv 0 \pmod{p^r}$ arising uniquely from $\lfloor c_1 \rfloor_p$. 

17. Suppose $f(x) \in Z[x]$ is a polynomial of degree $n$. For any $h \in Z$, show that 

$$f(x + h) = f(x) + f'(x)h + \frac{1}{2!} f''(x) h^2 + \cdots + \frac{1}{n!} f^{(n)}(x) h^n$$

where $f''(x)$ is the derivative of $f'(x)$ and, recursively, $f^{(i)}(x)$ is the derivative 
of $f^{(i-1)}(x)$. Furthermore, each term $f^{(i)}(x) h^i / i!$ belongs to $Z[x]$. 

The reader with calculus experience should recognize this as a form of 
Taylor’s theorem. 

3-7. Quadratic Reciprocity 

In the last section, we studied the problem of solving a polynomial 
congruence 

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \equiv 0 \pmod{m}, \qquad a_0, a_1, \ldots, a_n \in Z.$$

Equivalently, this is the problem of finding the roots in $Z_m$ of the polynomial 

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in Z_m[x].$$

(Strictly speaking, this latter polynomial should be written as 

$$\lfloor a_0 \rfloor_m + \lfloor a_1 \rfloor_m x + \cdots + \lfloor a_n \rfloor_m x^n \in Z_m[x]$$

but we prefer to let an integer represent itself or its residue class, depending on 
the context. This has been done many times, so the danger of confusion should 
not be great.) It was seen that this reduces to the problem of solving the 
congruence modulo a prime. More precisely, once we know how to find all 
solutions of 

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \equiv 0 \pmod{p}$$

when $p$ is prime then, by the recipe of 3-6-5, we can find all solutions of 

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \equiv 0 \pmod{p^r}$$

for any choice of $r > 1$. Once this has been done for every prime power $p^r$ 
which appears in the prime factorization of $m$, we can make use of the Chinese 
remainder theorem, as in 3-6-8, to obtain all the solutions of the original 
congruence mod $m$. 

Our concern turns, therefore, to solving the congruence 

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \equiv 0 \pmod{p}, \qquad p \text{ prime}$$

or, equivalently, to finding the roots in the field $Z_p$ of the polynomial 

$$f(x) = a_0 + a_1 x + \cdots + a_n x^n \in Z_p[x].$$

Unfortunately, this is too hard in general (though for reasonably small $p$, it is 
feasible to simply test every element of $Z_p$ and thus find all the roots). This is 
not surprising; after all, in the familiar situation of polynomials with real 
coefficients we have no general method which always locates the real roots. 

Consequently, just as is done in the study of polynomials with real coefficients, 
it is natural to restrict the degree $n$ of $f(x)$. When $n = 1$, we are 
dealing with a linear congruence 

$$a_0 + a_1 x \equiv 0 \pmod{p}$$

and, in virtue of our work in Section 3-1, we know everything about solving 
it. When $n = 2$, which means that the polynomial is “quadratic,” we are 
dealing (after a trivial change of notation) with a congruence of form 

$$f(x) = ax^2 + bx + c \equiv 0 \pmod{p} \qquad (*)$$

or, equivalently, with the polynomial 

$$f(x) = ax^2 + bx + c \in Z_p[x]. \qquad (**)$$

Again, we emphasize that $a$, $b$, $c$ are viewed, interchangeably, as integers or as 
elements of $Z_p$. In particular, since $f(x) \in Z_p[x]$ has degree 2, $a \neq 0$ in $Z_p$; or 
what is the same thing, $p \nmid a$ when $a$ is viewed in $Z$. 

If $p = 2$, the only possible roots of (**) [or solutions of (*)] are 0 and 1, so 
it is trivial to find the roots. Therefore, throughout this section, we shall 
always assume that 


$p$ is an odd prime. 

3-7-1. Remark. As a first step, let us transform and simplify the problem 
of finding the roots of 

$$f(x) = ax^2 + bx + c \in Z_p[x]. \qquad (*)$$

The idea is to use the standard technique of “completing the square,” as was 
done in 3-5-22. Note first that $a \neq 0$ in $Z_p$, by hypothesis, and $4 \neq 0$ in $Z_p$, 
because $p$ is an odd prime. Therefore, $4a \neq 0$ in the field $Z_p$. Now, for $x_0 \in Z_p$, 
we have 

$$\begin{aligned} x_0 \text{ is a root of } (*) &\iff ax_0^2 + bx_0 + c = 0 \\ &\iff 4a(ax_0^2 + bx_0) = -4ac \\ &\iff 4a^2 x_0^2 + 4abx_0 + b^2 = b^2 - 4ac \\ &\iff (2ax_0 + b)^2 = b^2 - 4ac. \end{aligned}$$

At this point, if we were dealing with real or complex numbers, the next step 
would be to take square roots and finally obtain 

$$x_0 = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

but in $Z_p$, the meaning of “$\pm\sqrt{b^2 - 4ac}$” is far from clear. Therefore, in 
order to be careful, let us consider the polynomial in a new indeterminate $y$, 

$$y^2 - (b^2 - 4ac) \in Z_p[y] \qquad (\#)$$

or, what is the same thing, the equation 

$$y^2 = b^2 - 4ac$$

over $Z_p$. If $x_0$ is a root of (*), then obviously $y_0 = 2ax_0 + b \in Z_p$ is a root of 
(#). Conversely, suppose $y_0 \in Z_p$ is a root of (#). Then, because $2a \neq 0$ in 
$Z_p$, the linear equation (over $Z_p$) 

$$2ax + b = y_0$$

has a unique solution $x_0 \in Z_p$. In fact, this solution is 

$$x_0 = (2a)^{-1}(y_0 - b)$$

where $(2a)^{-1}$ denotes the multiplicative inverse of $2a$ in the field $Z_p$. Thus, 
$(2ax_0 + b)^2 = y_0^2 = b^2 - 4ac$, and from our equivalences, $x_0$ is a root of (*). 
We have proved: 

(*) has the root $x_0$ $\iff$ (#) has the root $y_0 = 2ax_0 + b$. 

All this means that the problem of finding the roots of any quadratic polynomial 
$ax^2 + bx + c \in Z_p[x]$ reduces to the problem of finding the roots of a 
simpler quadratic polynomial $y^2 - (b^2 - 4ac)$. Therefore, with a trivial 
change of notation, it suffices for us to consider the problem of finding the 
roots of polynomials of form 

$$x^2 - a \in Z_p[x] \qquad (a \neq 0 \text{ in } Z_p).$$

(Any root is then entitled to be called a “square root of $a$” in $Z_p$, and $a$ may 
be said to be a “square” in $Z_p$. Of course, there are at most two distinct 
square roots of $a$, because $x^2 - a$ is a polynomial of degree 2 over the field 
$Z_p$.) Equivalently, our problem may be restated as that of solving the congruence 

$$x^2 \equiv a \pmod{p}, \qquad (a \in Z,\ p \nmid a).$$
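The reduction just described (pass from $ax^2 + bx + c$ to $y^2 = b^2 - 4ac$, then recover $x_0 = (2a)^{-1}(y_0 - b)$) can be sketched in code. This is an illustrative sketch under stated assumptions: $p$ is an odd prime with $p \nmid a$, the name `quadratic_roots_mod_p` is hypothetical, square roots in $Z_p$ are found by plain trial, and the inverse uses the three-argument `pow` of Python 3.8 and later.

```python
def quadratic_roots_mod_p(a, b, c, p):
    """Roots in Z_p of a*x^2 + b*x + c, for an odd prime p not dividing a."""
    disc = (b * b - 4 * a * c) % p       # b^2 - 4ac viewed in Z_p
    inv_2a = pow(2 * a, -1, p)           # (2a)^(-1) in the field Z_p
    # Roots y0 of y^2 - (b^2 - 4ac), located by trying every element of Z_p.
    ys = [y for y in range(p) if (y * y - disc) % p == 0]
    # Each y0 gives exactly one x0 with 2a*x0 + b = y0.
    return sorted({(inv_2a * (y - b)) % p for y in ys})


print(quadratic_roots_mod_p(1, 1, 1, 7))    # x^2 + x + 1 over Z_7 -> [2, 4]
print(quadratic_roots_mod_p(1, -1, 2, 7))   # x^2 - x + 2 over Z_7 -> [4]
print(quadratic_roots_mod_p(1, 0, -3, 7))   # x^2 - 3 over Z_7     -> []
```

The three sample calls show the three possible outcomes: two roots when $b^2 - 4ac$ is a nonzero square, one root when it is 0, and none when it is not a square in $Z_p$.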

3-7-2. Definition. Suppose $p$ is an odd prime and $a$ is an integer with 
$p \nmid a$. We say that $a$ is a quadratic residue (mod $p$) when the congruence 
$x^2 \equiv a \pmod{p}$ has a solution; if the congruence has no solution, $a$ is called a 
quadratic nonresidue (mod $p$). 

It is extremely convenient to introduce a symbol or notation due to 
Legendre (1752-1833); namely, 

$$\left(\frac{a}{p}\right) = \begin{cases} +1, & \text{when } a \text{ is a quadratic residue (mod } p), \\ -1, & \text{when } a \text{ is a quadratic nonresidue (mod } p). \end{cases}$$

One reads $\left(\tfrac{a}{p}\right)$ as “the Legendre symbol of $a$ over $p$.” 

There are several equivalent formulations for these definitions; in fact, 

$$\begin{aligned} \left(\frac{a}{p}\right) = 1 &\iff a \text{ is a quadratic residue (mod } p) \\ &\iff x^2 \equiv a \pmod{p} \text{ has a solution} \\ &\iff \text{the polynomial } x^2 - a \in Z_p[x] \text{ has a root in } Z_p \\ &\iff a \text{ (more precisely, } \lfloor a \rfloor_p) \text{ is a square in } Z_p \end{aligned}$$

and 

$$\begin{aligned} \left(\frac{a}{p}\right) = -1 &\iff a \text{ is a quadratic nonresidue (mod } p) \\ &\iff x^2 \equiv a \pmod{p} \text{ has no solution} \\ &\iff \text{the polynomial } x^2 - a \in Z_p[x] \text{ has no root in } Z_p \\ &\iff a \text{ (more precisely, } \lfloor a \rfloor_p) \text{ is not a square in } Z_p. \end{aligned}$$





3-7-3. Examples. (1) Let $p = 7$. The congruence $x^2 \equiv 2 \pmod{7}$ has a 
solution (in fact, 3 is a solution, and so is 4), so $\left(\tfrac{2}{7}\right) = 1$. On the other hand, 
$\left(\tfrac{3}{7}\right) = -1$ because the congruence $x^2 \equiv 3 \pmod{7}$ has no solution, or equivalently 
because the polynomial $x^2 - 3 \in Z_7[x]$ has no root in $Z_7$. In similar 
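fashion, $\left(\tfrac{1}{7}\right) = 1$, $\left(\tfrac{4}{7}\right) = 1$, $\left(\tfrac{5}{7}\right) = -1$, $\left(\tfrac{6}{7}\right) = -1$, $\left(\tfrac{10}{7}\right) = -1$, $\left(\tfrac{-1}{7}\right) = -1$, and so 
on. 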

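(2) Let $p = 11$. In the field $Z_{11}$, we have 

$$\begin{aligned} 1^2 &= 1, & 2^2 &= 4, & 3^2 &= 9, & 4^2 &= 5, & 5^2 &= 3, \\ 10^2 &= 1, & 9^2 &= 4, & 8^2 &= 9, & 7^2 &= 5, & 6^2 &= 3. \end{aligned}$$

Therefore, 1, 3, 4, 5, 9 are quadratic residues (mod 11), and 2, 6, 7, 8, 10 are 
quadratic nonresidues (mod 11). In other words, $\left(\tfrac{a}{11}\right) = 1$ for $a = 1, 3, 4, 5, 9$ 
and $\left(\tfrac{a}{11}\right) = -1$ for $a = 2, 6, 7, 8, 10$. 

What about $\left(\tfrac{-1}{11}\right)$? In $Z_{11}$, $-1 = 10$, which is not a square in $Z_{11}$; so 
$x^2 + 1 \in Z_{11}[x]$ has no root in $Z_{11}$, and $\left(\tfrac{-1}{11}\right) = -1$. 

Similarly, in $Z_{11}$, $9 = -2 = 20 = -13 = \cdots$, so they all have the same 
Legendre symbol; that is, $\left(\tfrac{9}{11}\right) = \left(\tfrac{-2}{11}\right) = \left(\tfrac{20}{11}\right) = \left(\tfrac{-13}{11}\right) = 1$. 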





(3) Consider $p = 13$. Then $\left(\tfrac{a}{13}\right) = 1$ for $a = 1, 3, 4, 9, 10, 12$ and $\left(\tfrac{a}{13}\right) = -1$ 
for $a = 2, 5, 6, 7, 8, 11$, since, in $Z_{13}$, 

$$\begin{aligned} 1^2 &= 1, & 2^2 &= 4, & 3^2 &= 9, & 4^2 &= 3, & 5^2 &= 12, & 6^2 &= 10, \\ 12^2 &= 1, & 11^2 &= 4, & 10^2 &= 9, & 9^2 &= 3, & 8^2 &= 12, & 7^2 &= 10. \end{aligned}$$

Thus, for example, $x^2 \equiv 10 \pmod{13}$ has a solution, and $x^2 \equiv 8 \pmod{13}$ does 
not have a solution. 

Note that for any $c = 1, 2, \ldots, 12$, viewed as an element of $Z_{13}$, we have 
$c^2 = (-c)^2 = (13 - c)^2$ in $Z_{13}$. 

Therefore, if $c \in Z$ is a solution of $x^2 \equiv a \pmod{13}$, then so is $13 - c$. Even 
more, these facts carry over to any $p$; namely, since 

$$c^2 = (-c)^2 = (p - c)^2 \quad \text{in } Z_p$$

or, equivalently (when we work in $Z$) since 

$$c^2 = (-c)^2 \equiv (p - c)^2 \pmod{p},$$

it follows that if $c \in Z_p$ is a root of $x^2 - a \in Z_p[x]$, then so is $p - c = -c$ in 
$Z_p$, or equivalently (when we work in $Z$), if $c \in Z$ is a solution of 
$x^2 \equiv a \pmod{p}$, then so is $p - c$. 

Our objective is to compute the Legendre symbol $\left(\tfrac{a}{p}\right)$ in all meaningful 
cases, and thus to settle the question of when the congruence $x^2 \equiv a \pmod{p}$ 
has a solution (or, equivalently, to determine which elements of any finite 
field $Z_p$ are squares). We begin with some elementary properties of the 
Legendre symbol. 


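3-7-4. Proposition. Suppose $a$ and $b$ are integers which are relatively prime 
to the odd prime $p$; then 

(i) $\left(\tfrac{a^2}{p}\right) = 1$, $\left(\tfrac{1}{p}\right) = 1$, 

(ii) $a \equiv b \pmod{p} \implies \left(\tfrac{a}{p}\right) = \left(\tfrac{b}{p}\right)$, 

(iii) $\left(\tfrac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$, 

(iv) $\left(\tfrac{ab}{p}\right) = \left(\tfrac{a}{p}\right)\left(\tfrac{b}{p}\right)$, 

(v) $\left(\tfrac{-1}{p}\right) = (-1)^{(p-1)/2}$. 

Proof : (i) The congruence $x^2 \equiv a^2 \pmod{p}$ obviously has a solution ($a$ 
itself will do), so $\left(\tfrac{a^2}{p}\right) = 1$; and then $\left(\tfrac{1}{p}\right) = \left(\tfrac{1^2}{p}\right) = 1$. 

(ii) Since $a \equiv b \pmod{p}$, $c \in Z$ is a solution of $x^2 \equiv a \pmod{p}$ if and only if 
it is a solution of $x^2 \equiv b \pmod{p}$. Thus, there are only two possibilities: either 
both $x^2 \equiv a \pmod{p}$ and $x^2 \equiv b \pmod{p}$ have a solution, or neither one has a 
solution. It follows that $\left(\tfrac{a}{p}\right) = \left(\tfrac{b}{p}\right)$. 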

(iii) If $\left(\tfrac{a}{p}\right) = 1$, then there exists $x_0 \in Z$ with $x_0^2 \equiv a \pmod{p}$. Raising both 
sides of the congruence to the power $(p - 1)/2$ gives 

$$a^{(p-1)/2} \equiv (x_0^2)^{(p-1)/2} = x_0^{p-1} \equiv 1 \pmod{p}.$$

[The congruence $x_0^{p-1} \equiv 1 \pmod{p}$ is valid in virtue of 3-2-22, part (7), or by 
making use of 3-5-12.] Thus: If $\left(\tfrac{a}{p}\right) = 1$, then $\left(\tfrac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$. 

On the other hand, suppose $\left(\tfrac{a}{p}\right) = -1$, so $x^2 \equiv a \pmod{p}$ has no solutions. 
By the theory of linear congruences, for each $r \in \{1, 2, \ldots, p - 1\}$ there exists 
an $r' \in \{1, 2, \ldots, p - 1\}$ with 

$$rr' \equiv a \pmod{p}$$

[that is, $r'$ is a solution of $rx \equiv a \pmod{p}$]. Moreover, $r' \neq r$; because if $r' = r$, 
then $r^2 \equiv a \pmod{p}$, which cannot be since $x^2 \equiv a \pmod{p}$ has no solutions. 
Thus, the set $\{1, 2, \ldots, p - 1\}$, with $p - 1$ elements, breaks up into pairs 
$\{r, r'\}$, and there are $(p - 1)/2$ such pairs. By Wilson’s theorem, 3-5-13, we have 

$$-1 \equiv (p - 1)! = \prod_{\text{all pairs}} (rr') \equiv a^{(p-1)/2} \pmod{p}.$$

Thus: If $\left(\tfrac{a}{p}\right) = -1$, then $\left(\tfrac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$. 

The proof of our result, with which the name Euler is usually associated, is 
now complete. 

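(iv) In virtue of (iii), we have 

$$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \equiv a^{(p-1)/2}\, b^{(p-1)/2} = (ab)^{(p-1)/2} \equiv \left(\frac{ab}{p}\right) \pmod{p}.$$

Now $\left(\tfrac{a}{p}\right)\left(\tfrac{b}{p}\right)$ is $+1$ or $-1$, and so is $\left(\tfrac{ab}{p}\right)$; if $\left(\tfrac{a}{p}\right)\left(\tfrac{b}{p}\right) \neq \left(\tfrac{ab}{p}\right)$, we have 
$1 \equiv -1 \pmod{p}$ or $2 \equiv 0 \pmod{p}$, which is impossible because $p$ is an odd prime. 
Hence, $\left(\tfrac{a}{p}\right)\left(\tfrac{b}{p}\right) = \left(\tfrac{ab}{p}\right)$. 

(v) According to (iii), $\left(\tfrac{-1}{p}\right) \equiv (-1)^{(p-1)/2} \pmod{p}$. Again, because each side 
is $+1$ or $-1$, it follows that they are equal. | 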
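Part (iii), Euler's criterion, gives a direct way to compute the Legendre symbol by modular exponentiation. The sketch below is illustrative only: the name `legendre` is hypothetical, and it assumes $p$ is an odd prime with $p \nmid a$ (no check is made).

```python
def legendre(a, p):
    """Legendre symbol (a/p) via a^((p-1)/2) mod p; assumes p odd prime, p not dividing a."""
    r = pow(a, (p - 1) // 2, p)   # r is 1 or p - 1 under the stated assumptions
    return -1 if r == p - 1 else r


# Quadratic residues modulo 11 and 13, as in Examples 3-7-3.
print([a for a in range(1, 11) if legendre(a, 11) == 1])   # -> [1, 3, 4, 5, 9]
print([a for a in range(1, 13) if legendre(a, 13) == 1])   # -> [1, 3, 4, 9, 10, 12]
print(legendre(-1, 11))                                    # -> -1
```

Since the value returned is computed from $a^{(p-1)/2}$ alone, parts (ii) and (iv) of the proposition are visible in the code: the result depends only on $a$ modulo $p$, and it is multiplicative in $a$.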

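It is customary to rephrase part (v) in view of the fact that 

$$\begin{aligned} (-1)^{(p-1)/2} = 1 &\iff \frac{p-1}{2} \text{ is even} \iff \frac{p-1}{2} \text{ is of form } 2n \\ &\iff p - 1 \text{ is of form } 4n \iff p \text{ is of form } 4n + 1 \\ &\iff p \equiv 1 \pmod{4} \end{aligned}$$

and 

$$\begin{aligned} (-1)^{(p-1)/2} = -1 &\iff \frac{p-1}{2} \text{ is odd} \iff \frac{p-1}{2} \text{ is of form } 2n + 1 \\ &\iff p - 1 \text{ is of form } 4n + 2 \iff p \text{ is of form } 4n + 3 \\ &\iff p \equiv 3 \pmod{4}. \end{aligned}$$

We have proved: 


3-7-5. Corollary. Suppose $p$ is an odd prime; then 

$$\left(\frac{-1}{p}\right) = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4}, \\ -1 & \text{if } p \equiv 3 \pmod{4}. \end{cases}$$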

We may illustrate the utility of this corollary by computing $\left(\tfrac{-1}{p}\right)$ for each 
of the primes $p = 41, 47, 67, 179, 197, 601$ and interpreting the result in 
various ways: $\left(\tfrac{-1}{41}\right) = 1$ since $41 \equiv 1 \pmod{4}$, so the congruence 
$x^2 \equiv -1 \pmod{41}$ has a solution; $\left(\tfrac{-1}{47}\right) = -1$ since $47 \equiv 3 \pmod{4}$, so the congruence 
$x^2 \equiv -1 \pmod{47}$ has no solution; $\left(\tfrac{-1}{67}\right) = -1$ because $67 \equiv 3 \pmod{4}$, 
so the polynomial $x^2 + 1$ is irreducible over $Z_{67}$; $\left(\tfrac{-1}{179}\right) = -1$ because 
$179 \equiv 3 \pmod{4}$, so the element $-1$ is not a square in $Z_{179}$; $\left(\tfrac{-1}{197}\right) = 1$ because 
$197 \equiv 1 \pmod{4}$, so the element $-1$ is a square in $Z_{197}$; $\left(\tfrac{-1}{601}\right) = 1$ because 
$601 \equiv 1 \pmod{4}$, so the polynomial $x^2 + 1 \in Z_{601}[x]$ has two roots in $Z_{601}$; 
that is, $x^2 + 1$ is reducible over $Z_{601}$. 
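These six illustrations can be confirmed by direct search for a square root of $-1$. The snippet is a sketch for checking purposes only; the name `minus_one_is_square` is hypothetical and the search simply tries every nonzero element of $Z_p$.

```python
def minus_one_is_square(p):
    """True when x^2 = -1 (mod p) has a solution, found by trying x = 1, ..., p - 1."""
    return any((x * x + 1) % p == 0 for x in range(1, p))


for p in (41, 47, 67, 179, 197, 601):
    # The corollary predicts a solution exactly when p leaves remainder 1 on division by 4.
    print(p, p % 4, minus_one_is_square(p))
```

For each prime in the list the printed truth value agrees with the test $p \equiv 1 \pmod{4}$, as Corollary 3-7-5 asserts.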

Our next result provides a way, somewhat better than trial and error, to 
compute the Legendre symbol $\left(\tfrac{a}{p}\right)$. However, its importance lies not in itself, 
but rather in the crucial role it plays in our derivation of the fundamental 
theorems about the Legendre symbol. 


3-7-6. Gauss’ Lemma. Suppose $p$ is an odd prime and $p \nmid a$. Consider the 
set 

$$S = \left\{ 1a,\ 2a,\ \ldots,\ \frac{p-1}{2}\,a \right\}.$$

Let $n$ denote the number of least positive residues modulo $p$ of the elements 
of $S$ which are greater than $p/2$. Then 

$$\left(\frac{a}{p}\right) = (-1)^n.$$

Proof : First, let us clarify the definition of $n$ and show, via some concrete 
examples, how Gauss’ lemma enables us to compute $\left(\tfrac{a}{p}\right)$. For each element of 
$S$ take the smallest positive integer in its residue class (mod $p$); in other words, 
for each $i = 1, 2, \ldots, (p - 1)/2$ apply the division algorithm for $p$ and $ia$; the 
remainder is the smallest positive integer in $\lfloor ia \rfloor_p$, and is referred to as the 
“least positive residue” (mod $p$) of $ia$. [Clearly, $p \nmid ia$ for $i = 1, \ldots, (p - 1)/2$, 
so we never have residue 0.] Thus, for each $i = 1, \ldots, (p - 1)/2$ we obtain an 
integer from $1, 2, \ldots, p - 1$. Counting how many of these are greater than 
$p/2$ determines $n$. 

For example: For $p = 13$ and $a = 3$ we have $(p - 1)/2 = 6$, so here 

$$S = \{3, 6, 9, 12, 15, 18\}$$

and the least positive residues (mod 13) of these integers are 

$$\{3, 6, 9, 12, 2, 5\}.$$

Two of these are greater than $13/2 = p/2$; so $n = 2$, and Gauss’ lemma asserts 
that $\left(\tfrac{3}{13}\right) = (-1)^2 = 1$. 

Similarly, when $p = 13$, $a = 5$ we have $S = \{5, 10, 15, 20, 25, 30\}$; so the 
least positive residues (mod 13) are $\{5, 10, 2, 7, 12, 4\}$. Three of these are 
greater than $13/2$, so $n = 3$ and $\left(\tfrac{5}{13}\right) = (-1)^3 = -1$. 

We now turn to the proof of Gauss’ lemma. Consider the least positive 
residues (mod $p$) of the elements of $S$. These $(p - 1)/2$ elements are distinct, 
because no two of the $(p - 1)/2$ elements of $S$ are congruent (mod $p$). Let 
$r_1, r_2, \ldots, r_n$ be the least positive residues which are greater than $p/2$, and let 
$s_1, s_2, \ldots, s_m$ be the least positive residues which are less than $p/2$. Of 
course, no least positive residue is equal to $p/2$, because $p/2$ is not an integer. 
In particular, $n + m = (p - 1)/2$. 

The elements $r_1, r_2, \ldots, r_n, s_1, s_2, \ldots, s_m$ are distinct. Thus, 
$p - r_1, p - r_2, \ldots, p - r_n$ are surely distinct, and $0 < p - r_i < p/2$ for each 
$i = 1, \ldots, n$ because $p/2 < r_i < p$. Consequently, all the elements 

$$p - r_1,\ p - r_2,\ \ldots,\ p - r_n,\ s_1,\ s_2,\ \ldots,\ s_m \qquad (*)$$

are greater than 0 and less than $p/2$ [that is, they are greater than or equal to 1 
and less than or equal to $(p - 1)/2$]. We assert that they are distinct. For this, it 
suffices to show that no $p - r_i$ can equal any $s_j$. In fact, for any $r_i$ and $s_j$ there 
exist integers $x_0$, $y_0$ with $1 \le x_0, y_0 \le (p - 1)/2$ for which 

$$r_i \equiv x_0 a \pmod{p} \quad \text{and} \quad s_j \equiv y_0 a \pmod{p}$$

so if $p - r_i = s_j$, then 

$$p - x_0 a \equiv y_0 a \pmod{p}.$$

And then $(x_0 + y_0)a \equiv 0 \pmod{p}$, which implies $p \mid (x_0 + y_0)$. But this is 
impossible because $1 < x_0 + y_0 \le p - 1$. The conclusion is: The set 
$\{p - r_1, \ldots, p - r_n, s_1, \ldots, s_m\}$ with $m + n = (p - 1)/2$ elements is precisely 
the set of integers $\{1, 2, \ldots, (p - 1)/2\}$. Therefore, [with congruences (mod $p$)] 

$$\begin{aligned} \left(\frac{p-1}{2}\right)! &= \prod_{i=1}^{n} (p - r_i) \prod_{j=1}^{m} s_j \\ &\equiv (-1)^n \prod_{i=1}^{n} r_i \prod_{j=1}^{m} s_j \\ &\equiv (-1)^n \prod_{i=1}^{(p-1)/2} (ia) \\ &= (-1)^n a^{(p-1)/2} \left(\frac{p-1}{2}\right)!. \end{aligned}$$

Since $((p - 1)/2)!$ is prime to $p$, we may cancel and obtain 

$$1 \equiv (-1)^n a^{(p-1)/2} \pmod{p}.$$

From this we conclude 

$$\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \equiv (-1)^n \pmod{p}$$

and because $\left(\tfrac{a}{p}\right)$ and $(-1)^n$ are $+1$ or $-1$, equality holds. This completes the 
proof. | 
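Gauss’ lemma translates directly into a counting procedure. The sketch below is illustrative: the names `least_positive_residues`, `gauss_n` and `legendre_by_gauss` are hypothetical, and $p$ is assumed to be an odd prime with $p \nmid a$.

```python
def least_positive_residues(a, p):
    """Least positive residues (mod p) of a, 2a, ..., ((p-1)/2)a, in that order."""
    return [(i * a) % p for i in range(1, (p - 1) // 2 + 1)]


def gauss_n(a, p):
    """The number n of Gauss' lemma: residues of S that exceed p/2."""
    return sum(1 for r in least_positive_residues(a, p) if 2 * r > p)


def legendre_by_gauss(a, p):
    """Legendre symbol (a/p) computed as (-1)^n."""
    return (-1) ** gauss_n(a, p)


print(least_positive_residues(3, 13), gauss_n(3, 13), legendre_by_gauss(3, 13))
# -> [3, 6, 9, 12, 2, 5] 2 1
print(least_positive_residues(5, 13), gauss_n(5, 13), legendre_by_gauss(5, 13))
# -> [5, 10, 2, 7, 12, 4] 3 -1
```

The comparison `2 * r > p` is used in place of `r > p / 2` so that everything stays in integer arithmetic.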

In preparation for our next result, which is rather technical in nature, we 
have need for a simple definition. For any real number a, let us denote the 
largest integer less than or equal to a by [a]. For example, [ ],which is known 
as the “greatest integer” function, takes the values: [y] = 3, [5] = 5, 
$[12.7] = 12$, $[\pi] = 3$, $[-3] = -3$, $[-\pi] = -4$, $[-12.7] = -13$; and, in 
general, $[a] \le a < [a] + 1$. 

We shall require only one fact about [ ]. Namely, suppose 

$$b = qa + r, \qquad 0 \le r < a$$

is the division algorithm for $a$ and $b$. Then $b/a = q + (r/a)$, and since 
$0 \le r/a < 1$, we have $[b/a] = q$. Thus, the division algorithm always takes the form 

$$b = \left[\frac{b}{a}\right] a + r, \qquad 0 \le r < a.$$

3-7-7. Lemma. Suppose $p$ is an odd prime and $p \nmid a$. As in Gauss’ lemma, 
let $n$ denote the number of least positive residues (mod $p$) of the elements 
of $S = \{1a, 2a, \ldots, ((p - 1)/2)a\}$ which are greater than $p/2$. If we put 

$$t = \sum_{i=1}^{(p-1)/2} \left[\frac{ia}{p}\right]$$

then 

$$(a - 1)\,\frac{p^2 - 1}{8} \equiv t - n \pmod{2}.$$

Proof: As in the proof of 3-7-6, let $r_1, r_2, \ldots, r_n$ be the least positive residues
(mod $p$) of $S$ which are greater than $p/2$, and $s_1, s_2, \ldots, s_m$ be the least
positive residues (mod $p$) of $S$ which are less than $p/2$. We have seen that

\[ \{p - r_1, \ldots, p - r_n, s_1, \ldots, s_m\} = \left\{1, 2, \ldots, \frac{p-1}{2}\right\}, \]

and consequently adding all the terms in each set yields

\[ \sum_{i=1}^{(p-1)/2} i = \sum_{i=1}^{n} (p - r_i) + \sum_{i=1}^{m} s_i. \]

The left-hand side is

\[ 1 + 2 + \cdots + \frac{p-1}{2} = \frac{p^2-1}{8}, \]

while the right-hand side is $np - \sum_{i=1}^{n} r_i + \sum_{i=1}^{m} s_i$, so

\[ \frac{p^2-1}{8} = np - \sum_{i=1}^{n} r_i + \sum_{i=1}^{m} s_i. \qquad (*) \]

Next, we observe that the division algorithm for $p$ and $ia$ takes the form

\[ ia = \left[\frac{ia}{p}\right] p + \mathrm{rem}_i, \]

where the remainder terms $\mathrm{rem}_i$ run over the set $\{r_1, \ldots, r_n, s_1, \ldots, s_m\}$ as $i$
runs from 1 to $(p-1)/2$. Therefore,

\[
\begin{aligned}
a\,\frac{p^2-1}{8} = \sum_{i=1}^{(p-1)/2} ia
&= \sum_{i=1}^{(p-1)/2} \left( \left[\frac{ia}{p}\right] p + \mathrm{rem}_i \right) \\
&= p \sum_{i=1}^{(p-1)/2} \left[\frac{ia}{p}\right] + \sum_{i=1}^{(p-1)/2} \mathrm{rem}_i \\
&= pt + \sum_{i=1}^{n} r_i + \sum_{i=1}^{m} s_i. \qquad (**)
\end{aligned}
\]

Subtracting (*) from (**) gives

\[ (a-1)\,\frac{p^2-1}{8} = p(t - n) + 2\sum_{i=1}^{n} r_i. \]

Viewing this equation modulo 2 we obtain [since $p$ is odd, and hence congruent
to 1 (mod 2)],

\[ (a-1)\,\frac{p^2-1}{8} \equiv t - n \pmod{2}. \]

This completes the proof. |
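The congruence of 3-7-7 is easy to test by machine for small cases. The sketch below assumes $a$ is a positive integer not divisible by $p$; the function name `lemma_3_7_7_holds` is hypothetical.

```python
def lemma_3_7_7_holds(a, p):
    """Check (a-1)(p^2-1)/8 = t - n (mod 2) for an odd prime p and positive a, p not dividing a."""
    half = (p - 1) // 2
    # n: number of least positive residues of a, 2a, ..., half*a that exceed p/2
    n = sum(1 for i in range(1, half + 1) if 2 * ((i * a) % p) > p)
    # t: the sum of the greatest-integer values [ia/p]
    t = sum((i * a) // p for i in range(1, half + 1))
    lhs = (a - 1) * ((p * p - 1) // 8)
    return (lhs - (t - n)) % 2 == 0

for p in (3, 5, 7, 11, 13, 17, 19, 23):
    assert all(lemma_3_7_7_holds(a, p) for a in range(1, 4 * p) if a % p != 0)
print("3-7-7 holds in every sampled case")
```

A finite check of this kind proves nothing, of course; it only guards against a misreading of the statement.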


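3-7-8. Theorem. If $p$ is an odd prime, then

\[
\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8} =
\begin{cases}
\phantom{-}1, & \text{if } p \equiv \pm 1 \pmod{8}, \\
-1, & \text{if } p \equiv \pm 3 \pmod{8}.
\end{cases}
\]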


Proof: Putting $a = 2$ in 3-7-7, we have

\[ \frac{p^2-1}{8} \equiv t - n \pmod{2}. \]

Furthermore, when $a = 2$,

\[ t = \left[\frac{2}{p}\right] + \left[\frac{4}{p}\right] + \cdots + \left[\frac{p-1}{p}\right]. \]

Each of these terms is 0 because each fraction is less than 1. Hence, $t = 0$. Of
course, $-n \equiv n \pmod{2}$, so making use of Gauss' lemma, we have

\[ \left(\frac{2}{p}\right) = (-1)^n = (-1)^{(p^2-1)/8}. \]

The value of the Legendre symbol $\bigl(\tfrac{2}{p}\bigr)$ depends, therefore, on whether
$(p^2-1)/8$ is odd or even. Now, any odd prime $p$ is one of the forms $8n+1$,
$8n+3$, $8n+5$, $8n+7$; or to put it another way, $p$ is of form $8n+1$, $8n+3$,
$8n-3$, or $8n-1$. If $p$ is of form $8n \pm 1$ [that is, $p \equiv \pm 1 \pmod{8}$], then

\[ \frac{p^2-1}{8} = \frac{(8n \pm 1)^2 - 1}{8} = \frac{64n^2 \pm 16n}{8} = 8n^2 \pm 2n, \]

which is even. On the other hand, if $p$ is of form $8n \pm 3$ [that is, $p \equiv \pm 3 \pmod{8}$],
then

\[ \frac{p^2-1}{8} = \frac{(8n \pm 3)^2 - 1}{8} = \frac{64n^2 \pm 48n + 8}{8} = 8n^2 \pm 6n + 1, \]

which is odd. This completes the proof. |

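As illustrations of this result, we compute $\bigl(\tfrac{2}{p}\bigr)$ for $p = 41, 47, 227, 229$;
namely, $\bigl(\tfrac{2}{41}\bigr) = 1$ because $41 \equiv 1 \pmod 8$, $\bigl(\tfrac{2}{47}\bigr) = 1$ because $47 \equiv -1 \pmod 8$,
$\bigl(\tfrac{2}{227}\bigr) = -1$ because $227 \equiv 3 \pmod 8$, $\bigl(\tfrac{2}{229}\bigr) = -1$ because $229 \equiv 5 \equiv
-3 \pmod 8$.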
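These four values can be cross-checked against Euler's criterion $\bigl(\tfrac{a}{p}\bigr) \equiv a^{(p-1)/2} \pmod p$, which was used in the proof of Gauss' lemma. This is a minimal sketch; the function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol via Euler's criterion; assumes p is an odd prime not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def two_over_p(p):
    """Value of (2/p) as predicted by 3-7-8."""
    return 1 if p % 8 in (1, 7) else -1

for p in (41, 47, 227, 229):
    print(p, legendre(2, p), two_over_p(p))
```

The two columns agree for each of the four primes, as the theorem requires.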


3-7-9. Remark. Thus far in our investigation of the value of the Legendre
symbol $\bigl(\tfrac{a}{p}\bigr)$ our attention has focused on $\bigl(\tfrac{-1}{p}\bigr)$ and $\bigl(\tfrac{2}{p}\bigr)$. Why this emphasis?
The explanation is fairly obvious. If $a$ is any nonzero integer, then its prime
factorization is of form

\[ a = (\pm 1)\, 2^{r_0} \prod_{i=1}^{s} q_i^{r_i}, \qquad r_0 \ge 0, \quad r_i > 0, \quad i = 1, \ldots, s. \]

The sign of $a$ determines the choice of $+1$ or $-1$. The exponent $r_0$ is greater
than or equal to 0, rather than just greater than 0, because $r_0 = 0$ occurs
when $2 \nmid a$. The exponents $r_i$ are all greater than 0 because we include only the
odd primes $q_i$ which divide $a$; in particular, because $p \nmid a$, the $q_i$ are odd primes
different from $p$. The prime 2 has been distinguished from the other primes;
this is due to a fact of life: the prime 2 behaves differently from the others.
According to 3-7-4, the Legendre symbol is multiplicative, so

\[ \left(\frac{a}{p}\right) = \left(\frac{\pm 1}{p}\right) \left(\frac{2}{p}\right)^{r_0} \prod_{i=1}^{s} \left(\frac{q_i}{p}\right)^{r_i}. \]

Thus, to compute $\bigl(\tfrac{a}{p}\bigr)$ in general, it suffices to know $\bigl(\tfrac{-1}{p}\bigr)$ [there is no need to
bother with $\bigl(\tfrac{1}{p}\bigr)$; it is always equal to 1, and anyway when $a$ is positive, the
factor $+1$ should be ignored], $\bigl(\tfrac{2}{p}\bigr)$, and $\bigl(\tfrac{q}{p}\bigr)$, where $q$ is an odd prime different
from $p$. Having already determined $\bigl(\tfrac{-1}{p}\bigr)$ and $\bigl(\tfrac{2}{p}\bigr)$, it remains to consider $\bigl(\tfrac{q}{p}\bigr)$.
For this we need one more result.


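3-7-10. Lemma. If $p$ and $q$ are distinct odd primes, then

\[ \sum_{i=1}^{(p-1)/2} \left[\frac{qi}{p}\right] + \sum_{i=1}^{(q-1)/2} \left[\frac{pi}{q}\right] = \frac{p-1}{2} \cdot \frac{q-1}{2}. \]

Proof: We give an elegant geometric proof, due to Eisenstein, for this
apparently formidable formula.
In the Euclidean plane, $\mathbf{R} \times \mathbf{R}$, consider the rectangle with vertices at
$(0, 0)$, $(p/2, 0)$, $(0, q/2)$, $(p/2, q/2)$, shown in the accompanying figure.

[Figure: the rectangle with vertices $(0,0)$, $(p/2,0)$, $(0,q/2)$, $(p/2,q/2)$ and its diagonal from $(0,0)$ to $(p/2,q/2)$.]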

Let us count the number of lattice points (meaning points both of whose
coordinates are integers) inside this rectangle (but not on the edges). Since
both $p$ and $q$ are odd, the set of lattice points clearly consists of all points $(m, n)$
where $m$ and $n$ are integers satisfying $1 \le m \le (p-1)/2$, $1 \le n \le (q-1)/2$.
The number of such points is obviously $((p-1)/2)((q-1)/2)$.

We can also count these lattice points in another way. Consider the
diagonal from $(0, 0)$ to $(p/2, q/2)$. The points on it (and inside the rectangle)
are those points on the straight line $y = qx/p$ whose coordinates satisfy
$0 < x < p/2$, $0 < y < q/2$. None of our lattice points is on this diagonal; in
fact, if $x$ is any integer $i$ with $0 < i < p/2$, the corresponding point on the
diagonal is $(i, qi/p)$, and $qi/p$ cannot be an integer because $p$ and $q$ are distinct
primes and $i < p$. It suffices, therefore, to count the lattice points inside each
of the triangles.

Consider the “lower triangle” with vertices $(0, 0)$, $(p/2, 0)$, $(p/2, q/2)$. The
way to count its lattice points is as follows: For each $i = 1, 2, \ldots, (p-1)/2$
consider the vertical line segment, inside the triangle, at $x = i$ (this segment
runs from $(i, 0)$ to $(i, qi/p)$, with endpoints excluded) and count the lattice
points on it. This amounts to counting the number of integers greater than 0
and less than $qi/p$. Since $qi/p$ is not an integer, this number is precisely $[qi/p]$.
Hence, the number of lattice points inside the lower triangle is

\[ \sum_{i=1}^{(p-1)/2} \left[\frac{qi}{p}\right]. \]

In similar fashion, to count the lattice points inside the “upper triangle”
with vertices $(0, 0)$, $(0, q/2)$, $(p/2, q/2)$, one uses, for each $i = 1, \ldots, (q-1)/2$,
the horizontal segment at $y = i$, which runs from $(0, i)$ to $(pi/q, i)$, and
ends up with

\[ \sum_{i=1}^{(q-1)/2} \left[\frac{pi}{q}\right] \]

lattice points inside the triangle. Since the two triangles together contain all
$((p-1)/2)((q-1)/2)$ lattice points of the rectangle, the proof is now complete. |
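The counting argument lends itself to a direct check: count the lattice points under the diagonal by brute force and compare with the sum of greatest-integer values. The sketch below uses hypothetical helper names and small primes only.

```python
def floor_sum(q, p):
    """Sum of [q*i/p] for i = 1, ..., (p-1)/2."""
    return sum((q * i) // p for i in range(1, (p - 1) // 2 + 1))

def points_in_lower_triangle(p, q):
    """Brute-force count of lattice points (m, n) inside the rectangle and below y = qx/p."""
    return sum(1
               for m in range(1, (p - 1) // 2 + 1)
               for n in range(1, (q - 1) // 2 + 1)
               if n * p < q * m)    # n < q*m/p, written without fractions

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23]
for p in odd_primes:
    for q in odd_primes:
        if p != q:
            assert points_in_lower_triangle(p, q) == floor_sum(q, p)
            assert floor_sum(q, p) + floor_sum(p, q) == ((p - 1) // 2) * ((q - 1) // 2)
print("lattice-point identity confirmed for the sampled primes")
```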


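3-7-11. Quadratic Reciprocity Law. If $p$ and $q$ are distinct odd primes,
then

\[
\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{((p-1)/2)((q-1)/2)} =
\begin{cases}
+1, & \text{if } p \text{ or } q \text{ is} \equiv 1 \pmod 4, \\
-1, & \text{if both } p \text{ and } q \text{ are} \equiv 3 \pmod 4.
\end{cases}
\]

Proof: Apply 3-7-7 for $p$ and $a = q$. Since $q$ is odd, $2 \mid (q-1)$; so the result
is $0 \equiv t - n \pmod 2$, or $t \equiv n \pmod 2$. In virtue of Gauss' lemma,

\[ \left(\frac{q}{p}\right) = (-1)^n = (-1)^t, \qquad \text{where } t = \sum_{j=1}^{(p-1)/2} \left[\frac{jq}{p}\right]. \]

Now, by interchanging the roles of $p$ and $q$, we obtain

\[ \left(\frac{p}{q}\right) = (-1)^{t'}, \qquad \text{where } t' = \sum_{j=1}^{(q-1)/2} \left[\frac{jp}{q}\right]. \]

Therefore, in virtue of 3-7-10,

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\sum_{i=1}^{(p-1)/2} [qi/p] + \sum_{i=1}^{(q-1)/2} [pi/q]} = (-1)^{((p-1)/2)((q-1)/2)}. \]

It remains to settle when $\bigl(\tfrac{p}{q}\bigr)\bigl(\tfrac{q}{p}\bigr)$ is $+1$ or $-1$, and this obviously depends
on whether $((p-1)/2)((q-1)/2)$ is even or odd. Since $((p-1)/2)((q-1)/2)$
is odd $\iff$ both $(p-1)/2$ and $(q-1)/2$ are odd $\iff$ both $p \equiv 3 \pmod 4$ and
$q \equiv 3 \pmod 4$, we see that $\bigl(\tfrac{p}{q}\bigr)\bigl(\tfrac{q}{p}\bigr) = -1$ if both $p$ and $q$ are $\equiv 3 \pmod 4$,
and $\bigl(\tfrac{p}{q}\bigr)\bigl(\tfrac{q}{p}\bigr) = 1$ in all other cases (that is, if $p \equiv 1 \pmod 4$ or $q \equiv 1 \pmod 4$, or
both). This completes the proof. |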
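As a sanity check on the statement, the law can be verified for small primes by computing both Legendre symbols from Euler's criterion. This is a sketch under that assumption; the helper names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol via Euler's criterion; assumes p is an odd prime not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def reciprocity_holds(p, q):
    """Compare (p/q)(q/p) with (-1)^(((p-1)/2)((q-1)/2)) for distinct odd primes p, q."""
    sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
    return legendre(p, q) * legendre(q, p) == sign

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
assert all(reciprocity_holds(p, q) for p in odd_primes for q in odd_primes if p != q)
print("reciprocity law confirmed for all sampled pairs")
```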


The first complete proof of the reciprocity law was given by Gauss, who 
considered the theorem so important that he gave eight proofs altogether. So 
many proofs have been given over the years that their number is in doubt; 
it has been asserted that there are more than 150 of them. For this reason,
most books on elementary number theory give substantially the same proof 
(due primarily to Eisenstein), and so have we. 

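Because the square of any Legendre symbol is 1, we may multiply through
by $\bigl(\tfrac{q}{p}\bigr)$ in the quadratic reciprocity law and arrive at the following formulation.
If $p$ and $q$ are distinct odd primes, then

\[
\left(\frac{p}{q}\right) =
\begin{cases}
-\left(\dfrac{q}{p}\right), & \text{if } p \equiv q \equiv 3 \pmod 4, \\[2ex]
\phantom{-}\left(\dfrac{q}{p}\right), & \text{otherwise.}
\end{cases}
\]

For example, $\bigl(\tfrac{q}{97}\bigr) = \bigl(\tfrac{97}{q}\bigr)$ for any odd prime $q \ne 97$, because $97 \equiv 1 \pmod 4$; and
$\bigl(\tfrac{23}{59}\bigr) = -\bigl(\tfrac{59}{23}\bigr)$ because $23 \equiv 3 \pmod 4$ and $59 \equiv 3 \pmod 4$.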

In virtue of the reciprocity law, coupled with 3-7-5 and 3-7-8 [which enable
us to evaluate $\bigl(\tfrac{-1}{p}\bigr)$ and $\bigl(\tfrac{2}{p}\bigr)$, and which are sometimes called “complements”
or “supplements” of the reciprocity law] we can compute $\bigl(\tfrac{a}{p}\bigr)$ in any concrete
case, as will be seen in the examples which follow.

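The reason for the name “quadratic reciprocity law” is that it displays a
reciprocity between the problems of existence of solutions for

\[ x^2 \equiv q \pmod p \qquad \text{and} \qquad x^2 \equiv p \pmod q. \]

More precisely, if $p \equiv q \equiv 3 \pmod 4$, then $\bigl(\tfrac{p}{q}\bigr) = -\bigl(\tfrac{q}{p}\bigr)$, so $\bigl(\tfrac{q}{p}\bigr) = 1 \iff \bigl(\tfrac{p}{q}\bigr) = -1$,
and $\bigl(\tfrac{q}{p}\bigr) = -1 \iff \bigl(\tfrac{p}{q}\bigr) = 1$; hence, in this situation

\[ x^2 \equiv q \pmod p \text{ has a solution} \iff x^2 \equiv p \pmod q \text{ has no solution} \]

and

\[ x^2 \equiv q \pmod p \text{ has no solution} \iff x^2 \equiv p \pmod q \text{ has a solution.} \]

On the other hand, if $p$ or $q$ is congruent to 1 (mod 4), then $\bigl(\tfrac{p}{q}\bigr) = \bigl(\tfrac{q}{p}\bigr)$, so that
$\bigl(\tfrac{q}{p}\bigr) = 1 \iff \bigl(\tfrac{p}{q}\bigr) = 1$ and $\bigl(\tfrac{q}{p}\bigr) = -1 \iff \bigl(\tfrac{p}{q}\bigr) = -1$; hence, in this situation

\[ x^2 \equiv q \pmod p \text{ has a solution} \iff x^2 \equiv p \pmod q \text{ has a solution} \]

and

\[ x^2 \equiv q \pmod p \text{ has no solution} \iff x^2 \equiv p \pmod q \text{ has no solution.} \]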

We are in no position to discuss the immense significance of the quadratic 
reciprocity law in more advanced branches of mathematics. All we can do 
is give some examples to illustrate the kinds of questions which can now be 
answered. 
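3-7-12. Examples. (1) What is the value of $\bigl(\tfrac{-23}{41}\bigr)$? This involves a straightforward
computation:

\[
\begin{aligned}
\left(\frac{-23}{41}\right) &= \left(\frac{-1}{41}\right)\left(\frac{23}{41}\right) && \text{[3-7-4, part (iii)]} \\
&= \left(\frac{23}{41}\right) && \text{[3-7-5, } 41 \equiv 1 \pmod 4\text{]} \\
&= \left(\frac{41}{23}\right) && \text{[3-7-11, } 41 \equiv 1 \pmod 4\text{]} \\
&= \left(\frac{18}{23}\right) && \text{[3-7-4, part (ii)]} \\
&= \left(\frac{2}{23}\right)\left(\frac{3^2}{23}\right) && \text{[3-7-4, part (iv)]} \\
&= \left(\frac{2}{23}\right) && \text{[3-7-4, part (i)]} \\
&= 1 && \text{[3-7-8, } 23 \equiv -1 \pmod 8\text{]}.
\end{aligned}
\]

In particular, the congruence $x^2 \equiv -23 \pmod{41}$ has a solution.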

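(2) Is $-59$ a square in $\mathbf{Z}_{131}$? This depends on the value of $\bigl(\tfrac{-59}{131}\bigr)$, so we
compute

\[
\begin{aligned}
\left(\frac{-59}{131}\right) &= \left(\frac{-1}{131}\right)\left(\frac{59}{131}\right) && \text{[3-7-4]} \\
&= -\left(\frac{59}{131}\right) && [131 \equiv 3 \pmod 4] \\
&= \left(\frac{131}{59}\right) && [59 \equiv 3 \pmod 4,\ 131 \equiv 3 \pmod 4] \\
&= \left(\frac{13}{59}\right) && \text{[3-7-4]} \\
&= \left(\frac{59}{13}\right) && [13 \equiv 1 \pmod 4] \\
&= \left(\frac{7}{13}\right) && \text{[3-7-4]} \\
&= \left(\frac{13}{7}\right) && [13 \equiv 1 \pmod 4] \\
&= \left(\frac{6}{7}\right) && \text{[3-7-4]} \\
&= \left(\frac{2}{7}\right)\left(\frac{3}{7}\right) && \text{[3-7-4]} \\
&= \left(\frac{3}{7}\right) && [7 \equiv -1 \pmod 8] \\
&= -1 && \text{(trial and error).}
\end{aligned}
\]

Therefore, $-59$ is not a square in $\mathbf{Z}_{131}$.

Another way to do this computation is to start from $-59 \equiv 72 \pmod{131}$, so

\[ \left(\frac{-59}{131}\right) = \left(\frac{72}{131}\right) = \left(\frac{2 \cdot 6^2}{131}\right) = \left(\frac{2}{131}\right) = -1, \]

because $131 \equiv 3 \pmod 8$.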
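(3) Is the polynomial $x^2 + 189$ irreducible over $\mathbf{Z}_{491}$? We compute,
starting with the observation that 491 is prime and $189 = 3^3 \cdot 7$,

\[
\begin{aligned}
\left(\frac{-189}{491}\right) &= \left(\frac{-1}{491}\right)\left(\frac{3^2}{491}\right)\left(\frac{3}{491}\right)\left(\frac{7}{491}\right) \\
&= (-1)\left[-\left(\frac{491}{3}\right)\right]\left[-\left(\frac{491}{7}\right)\right] \\
&= -\left(\frac{2}{3}\right)\left(\frac{1}{7}\right) = 1.
\end{aligned}
\]

[Among the facts used in this computation are: $491 \equiv 3 \pmod 4$, $\bigl(\tfrac{2}{3}\bigr) = -1$
because $x^2 \equiv 2 \pmod 3$ has no solution, $7 \equiv 3 \pmod 4$.] This shows that
$x^2 + 189 \in \mathbf{Z}_{491}[x]$ has a root in $\mathbf{Z}_{491}$, so it is not irreducible. Of course, our
computation here (as is true, in general, when the Legendre symbol is computed)
only settles the existence question for a root; it provides no information
about locating a root.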
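For moduli as small as these, a root can simply be located by exhaustive search. The sketch below is not a method from the text; it is a brute-force check, and the function name `square_roots_mod` is hypothetical.

```python
def square_roots_mod(a, p):
    """All x in {0, ..., p-1} with x^2 = a (mod p), found by trying every residue."""
    return [x for x in range(p) if (x * x - a) % p == 0]

print(square_roots_mod(-23, 41))     # [10, 31]
roots = square_roots_mod(-189, 491)
print(roots)                          # two roots, which are negatives of each other mod 491
print(square_roots_mod(-59, 131))    # []
```

The search takes $p$ steps, so it is practical only for small primes; the Legendre symbol computation, by contrast, needs only a handful of reductions.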
(4) For which odd primes $p$ is 3 a quadratic residue? We must determine all
$p$ for which $\bigl(\tfrac{3}{p}\bigr) = 1$. [Of course, the case $p = 3$ is excluded since quadratic
residue and Legendre symbol are defined only when $p \nmid a$; see 3-7-2.] By
quadratic reciprocity,

\[ \left(\frac{3}{p}\right)\left(\frac{p}{3}\right) = (-1)^{((3-1)/2)((p-1)/2)} = (-1)^{(p-1)/2} \]

or, upon multiplying both sides by $\bigl(\tfrac{p}{3}\bigr)$,

\[ \left(\frac{3}{p}\right) = (-1)^{(p-1)/2} \left(\frac{p}{3}\right). \]

Now, $\bigl(\tfrac{3}{p}\bigr) = 1$ in exactly two situations:

(i) $(-1)^{(p-1)/2} = \bigl(\tfrac{p}{3}\bigr) = 1$,

(ii) $(-1)^{(p-1)/2} = \bigl(\tfrac{p}{3}\bigr) = -1$.

Recalling that

\[
(-1)^{(p-1)/2} =
\begin{cases}
\phantom{-}1, & \text{if } p \equiv 1 \pmod 4, \\
-1, & \text{if } p \equiv 3 \pmod 4,
\end{cases}
\]

and noting that for an odd prime $p \ne 3$ we have clearly

\[
\left(\frac{p}{3}\right) =
\begin{cases}
\phantom{-}1, & \text{if } p \equiv 1 \pmod 3, \\
-1, & \text{if } p \equiv 2 \pmod 3,
\end{cases}
\]

we see that the two situations in question are:

(i) $p \equiv 1 \pmod 4$ and $p \equiv 1 \pmod 3$,

(ii) $p \equiv 3 \pmod 4$ and $p \equiv 2 \pmod 3$.

Solving the simultaneous congruences by inspection (it is not worth the effort
to use the Chinese remainder theorem) we have

(i) $p \equiv 1 \pmod{12}$, (ii) $p \equiv 11 \equiv -1 \pmod{12}$.

This shows

\[ \left(\frac{3}{p}\right) = 1 \iff p \equiv \pm 1 \pmod{12}. \]
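Solving "by inspection" amounts to running through the residues modulo 12, and the final criterion can be compared with Euler's criterion for small primes. A minimal sketch with hypothetical helper names:

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def legendre(a, p):
    """Legendre symbol via Euler's criterion; assumes p is an odd prime not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

# The two pairs of simultaneous congruences, solved by listing residues mod 12.
case_i = [r for r in range(12) if r % 4 == 1 and r % 3 == 1]
case_ii = [r for r in range(12) if r % 4 == 3 and r % 3 == 2]
print(case_i, case_ii)    # [1] [11]

# Primes p > 3 below 500 where (3/p) = 1 disagrees with p = ±1 (mod 12).
exceptions = [p for p in range(5, 500)
              if is_prime(p) and (legendre(3, p) == 1) != (p % 12 in (1, 11))]
print(exceptions)         # []
```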


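(5) For which primes $p$ is the polynomial $x^2 + 3$ reducible over $\mathbf{Z}_p$?

This is another way of asking when $x^2 + 3$ has a root in $\mathbf{Z}_p$, or when does
the congruence $x^2 \equiv -3 \pmod p$ have a solution. For $p = 2$ and $p = 3$ the
congruence surely has a solution, so suppose $p > 3$. We must determine the
primes $p$ for which $\bigl(\tfrac{-3}{p}\bigr) = 1$. Now,

\[
\begin{aligned}
\left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right)
&= (-1)^{(p-1)/2} (-1)^{((p-1)/2)((3-1)/2)} \left(\frac{p}{3}\right) \\
&= (-1)^{(p-1)/2} (-1)^{(p-1)/2} \left(\frac{p}{3}\right) = \left(\frac{p}{3}\right),
\end{aligned}
\]

and $\bigl(\tfrac{p}{3}\bigr) = 1$ for $p \equiv 1 \pmod 3$, $\bigl(\tfrac{p}{3}\bigr) = -1$ for $p \equiv 2 \pmod 3$. Thus, except for
the primes 2 and 3, $\bigl(\tfrac{-3}{p}\bigr) = 1 \iff p \equiv 1 \pmod 3$ (that is, $\bigl(\tfrac{-3}{p}\bigr) = 1 \iff p$ is of
form $3n + 1$).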

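(6) When is $-2$ a quadratic residue (mod $p$)? We note first that the
Legendre symbol $\bigl(\tfrac{-2}{p}\bigr) = \bigl(\tfrac{-1}{p}\bigr)\bigl(\tfrac{2}{p}\bigr)$ equals 1 under two sets of circumstances:

(i) $\bigl(\tfrac{-1}{p}\bigr) = 1$ and $\bigl(\tfrac{2}{p}\bigr) = 1$,
(ii) $\bigl(\tfrac{-1}{p}\bigr) = -1$ and $\bigl(\tfrac{2}{p}\bigr) = -1$.

In virtue of 3-7-5 and 3-7-8 these become:

(i) $p \equiv 1 \pmod 4$ and $p \equiv \pm 1 \pmod 8$,
(ii) $p \equiv 3 \pmod 4$ and $p \equiv \pm 3 \pmod 8$.

Since $p \equiv 1 \pmod 4$ and $p \equiv -1 \pmod 8$ are impossible simultaneously,
(i) becomes $p \equiv 1 \pmod 8$. In similar fashion, (ii) becomes $p \equiv 3 \pmod 8$. Thus,
$-2$ is a quadratic residue (mod $p$) $\iff p \equiv 1$ or $3 \pmod 8$.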


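(7) What is the value of $\bigl(\tfrac{5}{p}\bigr)$ ($p$ an odd prime, not equal to 5)? By quadratic
reciprocity,

\[ \left(\frac{5}{p}\right)\left(\frac{p}{5}\right) = (-1)^{((5-1)/2)((p-1)/2)} = 1. \]

Therefore,

\[
\left(\frac{5}{p}\right) = \left(\frac{p}{5}\right) =
\begin{cases}
\phantom{-}1, & \text{if } p \equiv \pm 1 \pmod 5, \\
-1, & \text{if } p \equiv \pm 2 \pmod 5.
\end{cases}
\]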
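The criteria obtained in Examples (5), (6), and (7) can be compared with Euler's criterion over a range of primes. The following is a sketch only; the table `criteria` and the helper names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def legendre(a, p):
    """Legendre symbol via Euler's criterion; assumes p is an odd prime not dividing a."""
    return 1 if pow(a % p, (p - 1) // 2, p) == 1 else -1

# a -> (modulus, residues of p for which a is a quadratic residue mod p)
criteria = {
    -3: (3, {1}),       # Example (5): p = 1 (mod 3)
    -2: (8, {1, 3}),    # Example (6): p = 1 or 3 (mod 8)
    5: (5, {1, 4}),     # Example (7): p = ±1 (mod 5)
}

def exceptions(a, limit=1000):
    """Primes 7 <= p < limit where the stated criterion disagrees with Euler's criterion."""
    modulus, residues = criteria[a]
    return [p for p in range(7, limit)
            if is_prime(p) and (legendre(a, p) == 1) != (p % modulus in residues)]

for a in criteria:
    print(a, exceptions(a))    # each list is empty
```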


There is another kind of question for which the techniques of this section
may be applied. Long ago, in 1-4-5, we proved that the number of primes of
form $4n + 3$ is infinite, but at that time we were unable to prove the infinitude
of primes of form $4n + 1$. With the tools now at our disposal, this difficulty is
easy to overcome.

3-7-13. Proposition. The number of primes of form $4n + 1$ is infinite.

Proof: Suppose there are only a finite number of such primes; denote
them by

\[ p_1 < p_2 < \cdots < p_m. \]

Consider the integer

\[ N = (2 p_1 p_2 \cdots p_m)^2 + 1. \]

It is of form $4n + 1$, so there surely exists an odd prime $p$ which divides $N$.
The fact that $p$ divides $(2 p_1 p_2 \cdots p_m)^2 + 1$ may be restated as

\[ (2 p_1 p_2 \cdots p_m)^2 \equiv -1 \pmod p. \]

In other words, the congruence $x^2 \equiv -1 \pmod p$ has a solution, and therefore
$\bigl(\tfrac{-1}{p}\bigr) = 1$. But $\bigl(\tfrac{-1}{p}\bigr) = 1 \Rightarrow p \equiv 1 \pmod 4 \Rightarrow p$ is of form $4n + 1 \Rightarrow p$ is one of the
$p_i$. This is a contradiction because no $p_i$ divides $N$. |
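The construction in the proof can be tried on a short list of known primes of form $4n + 1$: every prime factor of $N$ turns out to be a new prime of the same form. This is a sketch using trial division; the helper names are hypothetical.

```python
from math import prod

def prime_factors(n):
    """Distinct prime factors of n by trial division (adequate for small n)."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def new_primes_from(known):
    """Prime factors of N = (2 * p_1 * ... * p_m)^2 + 1 for the given primes p_i."""
    N = (2 * prod(known)) ** 2 + 1
    return prime_factors(N)

print(new_primes_from([5]))          # [101]
known = [5, 13, 17]
found = new_primes_from(known)
print(found)
print(all(q % 4 == 1 and q not in known for q in found))    # True
```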

3-7-14. Exercise. The Legendre symbol $\bigl(\tfrac{a}{p}\bigr)$ was defined only when $p$ is an
odd prime with $p \nmid a$. We extend the Legendre symbol to other “denominators,”
as was done by Jacobi (1804-1851).

(1) Let $a$ and $b$ be relatively prime integers with $b$ odd and positive. Thus,
the prime factorization of $b$ looks like $b = p_1 p_2 \cdots p_r$, where the $p_i$ are odd
primes which need not be distinct. Define the Jacobi symbol $\bigl(\tfrac{a}{b}\bigr)$ by

\[ \left(\frac{a}{b}\right) = \left(\frac{a}{p_1}\right)\left(\frac{a}{p_2}\right) \cdots \left(\frac{a}{p_r}\right), \]

where $\bigl(\tfrac{a}{p_i}\bigr)$ is the Legendre symbol. For example, if $a = 13$, $b = 17{,}325 =
3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdot 11$, then $(a, b) = 1$ and

\[ \left(\frac{13}{17{,}325}\right) = \left(\frac{13}{3}\right)\left(\frac{13}{3}\right)\left(\frac{13}{5}\right)\left(\frac{13}{5}\right)\left(\frac{13}{7}\right)\left(\frac{13}{11}\right). \]

Of course, the Jacobi symbol $\bigl(\tfrac{a}{b}\bigr)$ is always $\pm 1$.

When $b$ is an odd prime $p$ the Legendre symbol and the Jacobi symbol
are indistinguishable; but no harm can come of it since they are equal,
that is, they take the same values. The Jacobi symbol is said to be an “extension”
of the Legendre symbol, in the sense that it is defined for more
cases, and when both are defined they are equal.
(2) With $a$ and $b = p_1 p_2 \cdots p_r$ as above, we say that $a$ is a quadratic
residue (mod $b$) when the congruence $x^2 \equiv a \pmod b$ has a solution. [Obviously,
this definition can be made even when $b$ is not odd; it is customary to talk
about quadratic residues (mod $b$) so long as $b$ is positive and $(a, b) = 1$.] We
have then: $a$ is a quadratic residue (mod $b$) $\Rightarrow a$ is a quadratic residue modulo
each $p_i \Rightarrow$ each $\bigl(\tfrac{a}{p_i}\bigr) = 1 \Rightarrow \bigl(\tfrac{a}{b}\bigr) = 1$. However, the converse is false; for example,

\[ \left(\frac{2}{33}\right) = \left(\frac{2}{3}\right)\left(\frac{2}{11}\right) = (-1)(-1) = 1, \]

but since $x^2 \equiv 2 \pmod 3$ has no solutions, it is immediate that 2 is not a
quadratic residue (mod 33). Of course, $\bigl(\tfrac{a}{b}\bigr) = -1$ implies $a$ is not a quadratic
residue (mod $b$).
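The counterexample can be confirmed directly: the product of the two Legendre symbols is 1, yet no square is congruent to 2 modulo 33. A small sketch, with the Legendre symbol computed from Euler's criterion (helper name hypothetical):

```python
def legendre(a, p):
    """Legendre symbol via Euler's criterion; assumes p is an odd prime not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

jacobi_2_33 = legendre(2, 3) * legendre(2, 11)
squares_mod_33 = sorted({(x * x) % 33 for x in range(33)})

print(jacobi_2_33)             # 1
print(2 in squares_mod_33)     # False
```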

(3) The Jacobi symbol has the following elementary properties. If $b$ and $b'$
are positive odd integers and $(aa', bb') = 1$, then




(4) Furthermore, the quadratic reciprocity law and its complements carry
over to the Jacobi symbol; namely, if $a$ and $b$ are positive odd integers with
$(a, b) = 1$, then

(i) $\left(\dfrac{-1}{b}\right) = (-1)^{(b-1)/2}$,

(ii) $\left(\dfrac{2}{b}\right) = (-1)^{(b^2-1)/8}$,

(iii) $\left(\dfrac{a}{b}\right)\left(\dfrac{b}{a}\right) = (-1)^{((a-1)/2)((b-1)/2)}$.
The proofs make use of the elementary facts that for odd integers $u$ and $v$ we
have

\[ \frac{u-1}{2} + \frac{v-1}{2} \equiv \frac{uv-1}{2} \pmod 2 \]

and

\[ \frac{u^2-1}{8} + \frac{v^2-1}{8} \equiv \frac{(uv)^2-1}{8} \pmod 2, \]

and then, inductively, these statements extend to $n$ odd integers $u_1, u_2, \ldots,
u_n$, giving

\[ \sum_{i=1}^{n} \frac{u_i - 1}{2} \equiv \frac{\left(\prod_{i=1}^{n} u_i\right) - 1}{2} \pmod 2 \]

and

\[ \sum_{i=1}^{n} \frac{u_i^2 - 1}{8} \equiv \frac{\left(\prod_{i=1}^{n} u_i\right)^2 - 1}{8} \pmod 2. \]


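(5) Note: If we had defined the Jacobi symbol by

\[
\left(\frac{a}{b}\right) =
\begin{cases}
\phantom{-}1, & \text{if } a \text{ is a quadratic residue (mod } b), \\
-1, & \text{if } a \text{ is not a quadratic residue (mod } b),
\end{cases}
\]

then the reciprocity law would not hold. For example, if $a = 5$, $b = 21$, then
$(-1)^{((a-1)/2)((b-1)/2)} = 1$, $\bigl(\tfrac{b}{a}\bigr) = \bigl(\tfrac{21}{5}\bigr) = 1$ because $x^2 \equiv 21 \equiv 1 \pmod 5$ has a
solution, and $\bigl(\tfrac{a}{b}\bigr) = \bigl(\tfrac{5}{21}\bigr) = -1$ because $x^2 \equiv 5 \pmod{21}$ has no solution (in
fact, it suffices to observe that $x^2 \equiv 5 \pmod 3$ has no solution).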

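(6) The computation of a Jacobi symbol is straightforward; for example,
$a = 231$ and $b = 1105$ are relatively prime odd integers, and

\[ \left(\frac{231}{1105}\right) = \left(\frac{1105}{231}\right) = \left(\frac{181}{231}\right) = \left(\frac{231}{181}\right) = \left(\frac{50}{181}\right) = \left(\frac{2}{181}\right)\left(\frac{25}{181}\right) = \left(\frac{2}{181}\right) = -1. \]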
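The steps of such a computation (reduce the numerator, remove factors of 2 by the second complement, then invert by reciprocity) can be mechanized. The following sketch is one standard way to organize them, not necessarily the procedure the exercise has in mind; the function name `jacobi` is hypothetical.

```python
def jacobi(a, b):
    """Jacobi symbol (a/b) for odd positive b; returns 0 when a and b are not relatively prime."""
    if b <= 0 or b % 2 == 0:
        raise ValueError("b must be odd and positive")
    a %= b
    result = 1
    while a != 0:
        while a % 2 == 0:            # pull out factors of 2 using (2/b) = (-1)^((b^2-1)/8)
            a //= 2
            if b % 8 in (3, 5):
                result = -result
        a, b = b, a                  # reciprocity: the sign flips only if both are 3 (mod 4)
        if a % 4 == 3 and b % 4 == 3:
            result = -result
        a %= b                       # reduce the new numerator modulo the new denominator
    return result if b == 1 else 0

print(jacobi(231, 1105))    # -1
print(jacobi(13, 17325))    # 1
print(jacobi(2, 33))        # 1
```

No factorization of $b$ is needed, which is the practical advantage of working with the Jacobi symbol rather than with a product of Legendre symbols.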



3-7-15 / PROBLEMS 

Throughout $p$ is understood to be an odd prime with $p \nmid a$.

1. List all the quadratic residues and all the quadratic nonresidues for each
of the following primes:

(i) 17, (ii) 19, (iii) 23, (iv) 29, (v) 31.
2. Find all solutions of

\[ x^2 \equiv a \pmod{13} \]

for each of the following values of $a$:

(i) 1, (ii) 3, (iii) 4, (iv) 5,

(v) 8, (vi) 9, (vii) 10, (viii) 12.

3. Do Problem 2 for

\[ x^2 \equiv a \pmod{13^2}. \]

4. Prove that the congruence $x^2 \equiv a \pmod p$ has no roots, or else it has two
distinct roots. In the latter case, how are the two roots related?

5. Prove the reduction step of 3-7-1 by working with congruences (rather
than over $\mathbf{Z}_p$).

6. We have seen that $a \equiv b \pmod p \Rightarrow \bigl(\tfrac{a}{p}\bigr) = \bigl(\tfrac{b}{p}\bigr)$. Prove this by viewing $a$ and $b$
in $\mathbf{Z}_p$.

7. Use Gauss' lemma, 3-7-6 (and nothing else) to compute the Legendre
symbol $\bigl(\tfrac{a}{p}\bigr)$ in the 24 cases given by

a = ±2, ±3, 5, 6, p = 11, 17, 19, 23.

8. Use our general results to evaluate $\bigl(\tfrac{a}{p}\bigr)$ for each of the 25 cases:

a = -1, ±2, ±3, p = 7, 11, 13, 17, 19.

9. Show that we always have $a^{(p-1)/2} \equiv \pm 1 \pmod p$ (where, as usual, $p$ is an
odd prime and $p \nmid a$).

10. Prove that of the nonzero elements of $\mathbf{Z}_p$, half are squares in $\mathbf{Z}_p$ and half
are not squares. In other words, half of the integers $1, 2, \ldots, p - 1$ are
quadratic residues (mod $p$) and half are quadratic nonresidues.

11. (i) Show that: If both $a$ and $b$ are quadratic residues (mod $p$), then so
is $ab$.

(ii) If both $a$ and $b$ are quadratic nonresidues (mod $p$), then $ab$ is a
quadratic residue.

(iii) What happens if one is a quadratic residue and the other is a quadratic
nonresidue?

12. Decide if the congruence $x^2 \equiv a \pmod p$ has a root for each of the 36
cases given by:

a = 5, -7, 11, -11, 13, -13, p = 97, 101, 103, 617, 619, 911.

13. Prove that $\sum_{a=1}^{p-1} \bigl(\tfrac{a}{p}\bigr) = 0$.
14. Decide if the following congruences have solutions, and how many?

(i) $x^2 \equiv 2 \pmod{47}$, (ii) $x^2 \equiv -2 \pmod{47}$,
(iii) $x^2 \equiv 2 \pmod{67}$, (iv) $x^2 \equiv -2 \pmod{67}$,
(v) $x^2 \equiv 2 \pmod{94}$, (vi) $x^2 \equiv -2 \pmod{94}$,
(vii) $x^2 \equiv 2 \pmod{134}$, (viii) $x^2 \equiv -2 \pmod{134}$,
(ix) $x^2 \equiv 2 \pmod{268}$, (x) $x^2 \equiv -2 \pmod{268}$.

15. For any $r > 1$ prove that $x^2 \equiv a \pmod{p^r}$ has a solution $\iff x^2 \equiv a \pmod p$
has a solution. In such a situation, how many solutions does $x^2 \equiv
a \pmod{p^r}$ have?


16. (i) Suppose $p$ and $q$ are distinct odd primes with $\bigl(\tfrac{a}{p}\bigr) = \bigl(\tfrac{a}{q}\bigr) = -1$. Show
that $x^2 \equiv a \pmod{pq}$ has no solutions.

(ii) What happens if $\bigl(\tfrac{a}{p}\bigr) = 1$ and $\bigl(\tfrac{a}{q}\bigr) = -1$?


17. Compute the Legendre symbols: 

m)- <“)(!)• <i,,) (^- 8 )’ 

(no (I), wo(^), 

<■*) m «iV/©■ <-> m 


18. (i) Suppose $p$ and $q$ are distinct odd primes, both of which are congruent
to 3 (mod 4). Show that if $x^2 \equiv q \pmod p$ has no solutions, then
$x^2 \equiv p \pmod q$ has two solutions.

(ii) What if both $p$ and $q$ are congruent to 1 (mod 4)?


19. Starting from Wilson’s theorem, prove that 




p 1)12 = — 1 (mod p). 


20. In $\mathbf{Z}_p$, the product of all nonzero elements which are squares is

\[
(-1)^{(p+1)/2} =
\begin{cases}
\phantom{-}1, & \text{if } p \equiv 3 \pmod 4, \\
-1, & \text{if } p \equiv 1 \pmod 4.
\end{cases}
\]

Restate this result in terms of quadratic residues and congruences.

21. Find all primes for which

\[ x^2 \equiv 11 \pmod p \]

has a solution. (Excluding $p = 2$ or 11, there are 10 classes of such primes.)
22. Find all primes p for which



23. Find all odd primes for which both 2 and 3 are quadratic residues. 

24. Find all primes p for which both 



25. (i) Prove that the number of primes of form $3n - 1$ is infinite.

(ii) Prove that the number of primes of form $3n + 1$ is infinite.

26. Prove that the congruence

\[ x^2 \equiv -3 \pmod{7^2 \cdot 19^2 \cdot 23} \]

has no solutions; but that the congruence

\[ x^2 \equiv -3 \pmod{7^2 \cdot 19^2 \cdot 31} \]

has exactly eight solutions (do not try to find them).

27. Suppose $(a, b) = 1$ where $b$ is odd and positive. Show that the congruence
$x^2 \equiv a \pmod b$ has a solution $\iff \bigl(\tfrac{a}{p}\bigr) = 1$ for every prime $p$ which divides
$b$. In such a situation, how many solutions does the congruence have?

28. For which values of $c$ does the congruence

\[ 3x^2 - 2x + c \equiv 0 \pmod 7 \]

have a solution? How many solutions does it then have?

29. (i) Does $x^2 \equiv 30 \pmod{91}$ have a solution? How about $x^2 \equiv 44 \pmod{91}$?
(ii) For how many $a$'s does the congruence

\[ x^2 \equiv a \pmod{91} \]

have a solution? Find at least 10 of them.

30. Without trying to solve, decide how many solutions there are to the congruence

\[ 37x^2 + 348x + 311 \equiv 0 \pmod{617} \]

(617 is prime).
31. If for the congruence $ax^2 + bx + c \equiv 0 \pmod p$ we let $\Delta = b^2 - 4ac$ denote
the discriminant, then it has

no solutions $\iff \bigl(\tfrac{\Delta}{p}\bigr) = -1$,
exactly one solution $\iff p \mid \Delta$,
two distinct solutions $\iff \bigl(\tfrac{\Delta}{p}\bigr) = 1$.

Give a concrete example of each type.


32. (i) If both $a$ and $b$ are quadratic residues (mod $p$), then $ax^2 \equiv b \pmod p$
has a solution.

(ii) What if both $a$ and $b$ are nonresidues?

(iii) What if one is a residue and the other a nonresidue? Give a concrete
example of each type, and exhibit a solution when one exists.


33. Suppose $q$ is the smallest positive integer which is a quadratic nonresidue
(mod $p$); prove that $q$ is prime. Furthermore, $q, 2q, \ldots, (q-1)q$ are all
less than $p$; hence, $q < \sqrt{p} + 1$.


Miscellaneous Problems 


1 . 


Solve the following pairs of simultaneous 
(0 5x -F I2y = 15 (mod 22); («) 

13x -ly = 19 (mod 22). 

(in) LU17* - IJLl= 1 16 * 1 > 7 ; (iv) 

[_§Jl7* + I -* 1 17 y = i 3 1 )7 . 


congruences or equations: 
Sx = 4 (mod 14); 

5x = 3 (mod 11). 


!_4_|l8* “ — 1 ^ 1 18 9 

I 8 lig* + 1 5 1 1 «>■ = I 3 Ira- 


(v) 36x s 27 (mod 45); 
35* = 27 (mod 45). 


2. By using the method of 3-1-7, prove the following generalization of the
Chinese remainder theorem. The system of linear congruences $x \equiv a_i
\pmod{m_i}$, $i = 1, \ldots, n$ has a solution $\iff (m_j, m_k) \mid (a_j - a_k)$ for all
$j \ne k$, $j, k = 1, \ldots, n$. Furthermore, the solution is then unique mod
$[m_1, \ldots, m_n]$; that is, if $x_0 \in \mathbf{Z}$ is a solution then $\lfloor x_0 \rfloor_{[m_1, \ldots, m_n]}$ is the set
of all solutions.


3. Suppose $m_1, m_2, \ldots, m_n$ are relatively prime in pairs, and consider the
system of linear congruences,

\[ a_i x \equiv b_i \pmod{m_i}, \qquad i = 1, \ldots, n. \]

Find a necessary and sufficient condition that this system have a solution.
How many solutions are there? What happens if $m_1, m_2, \ldots, m_n$ are not
relatively prime in pairs?
4. Prove that the number of fractions $a/b$ (with $a$, $b$ both positive integers)
in lowest terms, of value 1 or less, and with denominator $m$ or less, is
$\phi(1) + \phi(2) + \cdots + \phi(m)$.

5. Prove that for $m > 1$, the sum of the integers from the set $\{1, 2, \ldots, m\}$
which are relatively prime to $m$ is $\tfrac{1}{2} m \phi(m)$.

6. If the positive integer $a$ is fixed, prove that the equation $\phi(m) = a$ has at
most a finite number of solutions for $m$.


7. Suppose $d \mid m$ and $d \ne m$; show that

\[ m - \phi(m) > d - \phi(d). \]

8. Prove that if $m$ and $n$ are positive integers with $(m, n) = d$ then

\[ \phi(mn) = \frac{d\,\phi(m)\,\phi(n)}{\phi(d)}. \]

9. Show that the Euler $\phi$-function satisfies

\[ \phi(n) = \sum_{d \mid n} \mu(d)\,\frac{n}{d}, \]

where $\mu$ is the Möbius function (see Chapter II, Miscellaneous Problem 56).


10. Suppose $f$ is a function defined on all positive integers $n$, with $f(n) > 0$.
Prove that

\[ g(n) = \prod_{d \mid n} f(d) \iff f(n) = \prod_{d \mid n} g(d)^{\mu(n/d)}. \]

11. Consider polynomials $f(x), g(x) \in R[x]$, where $R$ is a commutative ring
with unity, and the composite polynomial $g(f(x)) \in R[x]$. [For example,
if $f(x) = 3x^3 - 1$, $g(x) = x^2 + x + 1$ in $\mathbf{Z}[x]$, then $g(f(x)) = (f(x))^2
+ f(x) + 1 = (3x^3 - 1)^2 + (3x^3 - 1) + 1 = 9x^6 - 3x^3 + 1$.] Denote the
derivative by $'$ (see 3-3-15, Problem 24). Show that

\[ \bigl(g(f(x))\bigr)' = \bigl[g'(f(x))\bigr] \cdot f'(x). \]

12. The element $c$ of the field $F$ is said to be a root of multiplicity $r \ge 1$ of the
polynomial $f(x) \in F[x]$ when $(x - c)^r$ is the power of $(x - c)$ which
appears in the prime factorization of $f(x)$; or, what is the same thing, when

\[ (x - c)^r \mid f(x), \quad \text{but} \quad (x - c)^{r+1} \nmid f(x). \]

When $r = 1$, $c$ is said to be a simple root, and when $r > 1$, $c$ is said to be a
multiple root.

(i) $c \in F$ is a multiple root of $f(x) \iff c$ is a root of $(f(x), f'(x))$, the gcd
of $f(x)$ and its derivative $f'(x)$.

(ii) Suppose $c$ is a root of $f(x)$; then $c$ is a multiple root of $f(x) \iff c$ is
a root of $f'(x)$.
13. For a polynomial $f(x) \in \mathbf{Z}_p[x]$, show that $f'(x) = 0 \iff f(x)$ is a polynomial
in $x^p$. [That is, $f(x)$ is of form $a_0 + a_1 x^p + a_2 x^{2p} + \cdots + a_r x^{rp}$;
one also denotes this by $f(x) = g(x^p)$, where $g(x) \in \mathbf{Z}_p[x]$.]

14. Let $R$ be an arbitrary ring; it need not be commutative or have a unity.
Show that $R[x]$ (by which is meant the set of all expressions $a_0 + a_1 x
+ \cdots + a_n x^n$ with coefficients in $R$) is a ring with respect to the standard
operations.

What about $R[[x]]$, the set of all formal power series (see 3-3-15,
Problem 25), when the ring $R$ is arbitrary?

15. How many elements are there in the ring $R[x]$? What is the characteristic
of this ring? What is its center?

16. Suppose $R$ is a commutative ring with unity, then so is $R[x]$. Taking
another indeterminate $y$, we can form the polynomial ring $(R[x])[y]$ in
$y$ over $R[x]$; denote it also by $R[x, y]$. Generalize this to form a commutative
ring with unity $R[x_1, x_2, \ldots, x_n]$ for each $n > 1$. What are its units?
What if $R$ is not commutative?

17. Suppose $R$ is a commutative ring with unity. If $a$ is a unit in $R$ and $b$ is
an arbitrary element of $R$, define the mapping $\phi : R[x] \to R[x]$ by

\[ \phi(f(x)) = f(ax + b), \qquad f(x) \in R[x]. \]

Show that $\phi$ is an automorphism of the ring $R[x]$.

18. We describe a somewhat more formal method for defining a polynomial
ring than the one given in Section 3-3.

(i) Suppose the ring $R$ with unity 1 is given. Let $S$ denote the set of all
infinite sequences

\[ \alpha = (a_0, a_1, a_2, a_3, \ldots) \]

of elements $a_i$ from $R$ such that only a finite number of the $a_i$ are
not equal to 0 (in other words, almost all $a_i$ equal 0). If $\beta = (b_0, b_1,
b_2, b_3, \ldots)$ is also an element of $S$ define the sum of $\alpha$ and $\beta$ by

\[ \alpha + \beta = (a_0 + b_0, a_1 + b_1, a_2 + b_2, \ldots) \]

and their product by

\[ \alpha\beta = (c_0, c_1, c_2, c_3, \ldots), \]

where, for $i = 0, 1, 2, \ldots$,

\[ c_i = a_0 b_i + a_1 b_{i-1} + \cdots + a_i b_0 = \sum_{r=0}^{i} a_r b_{i-r} = \sum_{r+s=i} a_r b_s. \]

Then $\alpha + \beta$ and $\alpha\beta$ are elements of $S$, and $S$ is a ring with unity
$(1, 0, 0, 0, \ldots)$.

(ii) Let $R_0$ denote the subset of $S$ consisting of all elements of form
$(a, 0, 0, 0, \ldots, 0, \ldots)$, $a \in R$. Then $R_0$ is a subring of $S$ which is
isomorphic to $R$ under the mapping

\[ a \to (a, 0, 0, 0, \ldots, 0, \ldots), \qquad a \in R. \]

We identify $R$ and $R_0$, and denote $(a, 0, 0, \ldots, 0, \ldots)$ also by $a$.
Thus, $R$ is viewed as a subring of $S$.

(iii) Let us denote $(0, 1, 0, 0, \ldots, 0, \ldots)$ by $x$. So $x \in S$. Then we have
$x^2 = (0, 0, 1, 0, 0, \ldots, 0, \ldots)$ and, inductively,

\[ x^i = (0, 0, \ldots, 0, 1, 0, \ldots, 0, \ldots), \]

where the 1 is in the $(i + 1)$st place. For any $a_i \in R \subset S$ we have

\[ a_i x^i = (0, \ldots, 0, a_i, 0, \ldots), \]

and consequently

\[ (a_0, a_1, a_2, a_3, \ldots) = a_0 + a_1 x + a_2 x^2 + \cdots = \sum_i a_i x^i. \]

Now, denote $S$ by $R[x]$; show that $R[x]$ has all the properties we want
it to have (that is, as in Section 3-3).
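The sequence definition translates almost verbatim into code. The sketch below takes $R = \mathbf{Z}$ for illustration and stores each sequence as a finite list with its trailing zeros dropped; these representation choices and the function names are assumptions, not part of the problem.

```python
def trim(seq):
    """Drop trailing zeros so that each sequence has one canonical finite form."""
    seq = list(seq)
    while seq and seq[-1] == 0:
        seq.pop()
    return seq

def poly_add(a, b):
    """Termwise sum (a_0 + b_0, a_1 + b_1, ...)."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim(x + y for x, y in zip(a, b))

def poly_mul(a, b):
    """Product with c_i equal to the sum of a_r * b_s over all r + s = i."""
    if not a or not b:
        return []
    c = [0] * (len(a) + len(b) - 1)
    for r, a_r in enumerate(a):
        for s, b_s in enumerate(b):
            c[r + s] += a_r * b_s
    return trim(c)

x = [0, 1]                          # the element (0, 1, 0, 0, ...)
print(poly_mul(x, x))               # [0, 0, 1]
print(poly_mul([1, 1], [1, -1]))    # [1, 0, -1]
print(poly_add([1, 2], [0, -2]))    # [1]
```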


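19. Let $F$ be a field. For $a \in F$, $b \in F^*$ let us denote $ab^{-1} = b^{-1}a$ by $a/b$; in
particular, $b^{-1} = 1/b$ and $0b^{-1} = 0/b = 0$. Verify the following for $a$,
$c \in F$, $b, d \in F^*$.

(i) $\dfrac{a}{b} = \dfrac{ad}{bd}$. (ii) $\dfrac{a}{b} = \dfrac{c}{d} \iff ad = bc$.

(iii) $\dfrac{a}{b} \cdot \dfrac{c}{d} = \dfrac{ac}{bd}$. (iv) $\dfrac{a}{b} + \dfrac{c}{d} = \dfrac{ad + bc}{bd}$.

(v) $\left(\dfrac{a}{b}\right)^n = \dfrac{a^n}{b^n}$, $n > 0$. What happens if $n$ is negative?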


20. Suppose an integral domain $D$ is contained in the field $F$; we say that $F$
is a quotient field of $D$ when every element of $F$ can be expressed in the
form $a/b = ab^{-1}$ with $a, b \in D$. We show that an arbitrary integral
domain $D$ has a quotient field (details are left to the reader).

Let $D'$ denote the set of nonzero elements of $D$. Consider the product
set

\[ D \times D' = \{(a, b) \mid a \in D,\ b \in D'\}. \]

Of course, equality for elements of $D \times D'$ is defined by

\[ (a, b) = (c, d) \iff a = c,\ b = d. \]

Define operations on $D \times D'$ as follows:

\[ (a, b) + (c, d) = (ad + bc, bd), \qquad (a, b) \cdot (c, d) = (ac, bd). \]

Clearly, $D \times D'$ is closed under these operations.

We want $(a, b)$ to behave like the “fraction” $a/b$, but also need to take
into account the expectation that a fraction can be written in more than
one way. To do this, define a relation $\equiv$ on $D \times D'$ by

\[ (a, b) \equiv (c, d) \iff ad = bc. \]

Then $\equiv$ is an equivalence relation; denote the set of equivalence classes
by $\overline{D \times D'}$, and the equivalence class to which $(a, b)$ belongs by $\lfloor (a, b) \rfloor$.
Note that

\[ \lfloor (a, b) \rfloor = \lfloor (c, d) \rfloor \iff ad = bc. \]

If we put

\[ \lfloor (a, b) \rfloor + \lfloor (c, d) \rfloor = \lfloor (ad + bc, bd) \rfloor, \qquad \lfloor (a, b) \rfloor \cdot \lfloor (c, d) \rfloor = \lfloor (ac, bd) \rfloor, \]

then $+$ and $\cdot$ are well-defined operations under which $\overline{D \times D'}$ becomes a
field (call it $F$); the zero element is $\lfloor (0, 1) \rfloor$; the additive inverse of
$\lfloor (a, b) \rfloor$ is $\lfloor (-a, b) \rfloor$; the identity for multiplication is $\lfloor (1, 1) \rfloor = \lfloor (a, a) \rfloor$,
$a \ne 0$.

Define a mapping of $D \to \overline{D \times D'} = F$ by

\[ a \to \lfloor (a, 1) \rfloor. \]

It is an injective homomorphism. Hence, the field $F$ contains an isomorphic
copy of $D$. We identify $D$ with its image, so $D \subset F$, and $\lfloor (a, 1) \rfloor$ is
simply $a$. For $b \in D'$, $\lfloor (1, b) \rfloor$ is then $b^{-1} = 1/b$, so every element $\lfloor (a, b) \rfloor$
of $F$ is of form $ab^{-1}$, $a \in D$, $b \in D'$. Thus $F$ is indeed a quotient field of $D$.

In particular, in the case when $D = \mathbf{Z}$ we have constructed the rational
field, $\mathbf{Q}$.
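For $D = \mathbf{Z}$ the construction can be imitated with pairs of integers, and the well-definedness of the operations can be spot-checked by replacing each pair with an equivalent one. This is only an illustrative sketch; the function names are hypothetical.

```python
def equivalent(u, v):
    """(a, b) is equivalent to (c, d) exactly when ad = bc."""
    (a, b), (c, d) = u, v
    return a * d == b * c

def add(u, v):
    """(a, b) + (c, d) = (ad + bc, bd)."""
    (a, b), (c, d) = u, v
    return (a * d + b * c, b * d)

def mul(u, v):
    """(a, b) * (c, d) = (ac, bd)."""
    (a, b), (c, d) = u, v
    return (a * c, b * d)

# Two representatives of 1/2 and two of 1/3 give equivalent sums and products.
print(add((1, 2), (1, 3)))                                        # (5, 6)
print(add((2, 4), (3, 9)))                                        # (30, 36)
print(equivalent(add((1, 2), (1, 3)), add((2, 4), (3, 9))))     # True
print(equivalent(mul((1, 2), (2, 3)), mul((2, 4), (4, 6))))     # True
```

A check on a few representatives is of course no substitute for the general verification that the problem asks for.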


21. The quotient field of an integral domain $D$ is unique. More precisely,
suppose $F$ and $F'$ are quotient fields of $D$; then there exists an isomorphism
$\psi$ of $F$ onto $F'$, which is the identity on $D$ [that is, $\psi(a) = a$ for all
$a \in D$].

More generally, suppose the field $F_i$ is a quotient field of the integral
domain $D_i$, $i = 1, 2$. If $\phi$ is an isomorphism of $D_1 \to D_2$ then there exists
an isomorphism $\psi : F_1 \to F_2$, which is an extension of $\phi$ [that is, $\psi(a)
= \phi(a)$ for all $a \in D_1$]. Prove these statements.
22. If $D$ is an ordered domain then so is its quotient field $F$ (meaning that $F$
satisfies the order requirements of an ordered domain) when the set of
positive elements is taken as $\{a/b \mid a, b \in D,\ ab > 0\}$. In particular show
that $\mathbf{Q}$ is an ordered field.

23. (i) Let $\alpha$ and $\beta$ be distinct elements of the ordered field $F$. Show that
there exists an element of $F$ between them; even more, there are an
infinite number of elements between them.

(ii) Prove that the positive elements of an ordered field are never well
ordered.


24. If the ordered domain $D$ is “archimedean” (see 2-4-10, Problem 17)
then so is its quotient field.

25. Fix a polynomial $g(x) \in F[x]$ of degree 1 or more. Show that an arbitrary
element $f(x)$ of $F[x]$ can be expressed uniquely in the form

\[ f(x) = r_0(x) + r_1(x) g(x) + r_2(x) g(x)^2 + \cdots + r_m(x) g(x)^m, \]

where $r_m(x) \ne 0$ and $\deg(r_i(x)) < \deg(g(x))$ for $i = 0, 1, \ldots, m$. In analogy
with the situation in $\mathbf{Z}$, one may refer to this expression as the expansion
of $f(x)$ in “base $g(x)$.”

26. Consider the polynomial ring $F[x]$ over a field $F$. It is an integral domain,
so it has a quotient field, called the field of rational forms (or functions)
over $F$ and denoted by $F(x)$. The elements of $F(x)$ are “fractions”; in
fact,

\[ F(x) = \left\{ \frac{f(x)}{g(x)} \;\middle|\; f(x), g(x) \in F[x],\ g(x) \ne 0 \right\}. \]

Given an element

\[ \frac{f(x)}{g(x)} \in F(x), \]

let

\[ g(x) = p_1(x)^{n_1} p_2(x)^{n_2} \cdots p_r(x)^{n_r} = \prod_{i=1}^{r} p_i(x)^{n_i} \]

be the factorization of $g(x)$ into distinct prime (that is, irreducible) polynomials
$p_1(x), \ldots, p_r(x)$ in $F[x]$. Then $f(x)/g(x)$ can be written in the form

\[ \frac{f(x)}{g(x)} = h(x) + \sum_{j=1}^{n_1} \frac{h_{1,j}(x)}{p_1(x)^j} + \sum_{j=1}^{n_2} \frac{h_{2,j}(x)}{p_2(x)^j} + \cdots + \sum_{j=1}^{n_r} \frac{h_{r,j}(x)}{p_r(x)^j}, \]

where $h(x)$ and all $h_{i,j}(x)$ are in $F[x]$ and

\[ \deg(h_{i,j}(x)) < \deg(p_i(x)), \qquad i = 1, \ldots, r, \quad j = 1, 2, \ldots, n_i. \]

Moreover, this expression for $f(x)/g(x)$ is unique.

In the case where $F$ is the real field $\mathbf{R}$ we know (see 3-5-25) that an
irreducible polynomial must be of degree 1 or 2; so the numerators in
the expression for $f(x)/g(x)$ are of degree 0 (that is, constants) or 1. This
expression plays a key role in calculus when showing how to integrate an
arbitrary rational function.

27. Suppose a and b are elements of the domain D; then for relatively prime 
integers m and n, we have 

$$a^m = b^m,\ a^n = b^n \implies a = b.$$

28. Can you prove 3-5-5 by using unique factorization into primes? 

29. Give examples of the following: 

(i) Two units whose sum is a unit. 

(ii) Two units whose sum is not a unit. 

(iii) Two zero-divisors whose sum is a zero-divisor. 

(iv) Two zero-divisors whose sum is not a zero-divisor. 

30. Prove the following statement: In F[x], the polynomial 
$f(x) = a_0 + a_1 x + \cdots + a_n x^n$ of degree n is irreducible $\iff$ the "reverse" polynomial 

$$f^*(x) = a_n + a_{n-1} x + \cdots + a_1 x^{n-1} + a_0 x^n$$ 

is irreducible. 

31. For any prime p, consider the cyclotomic polynomial 

$$f(x) = x^{p-1} + x^{p-2} + \cdots + x + 1 \in Z[x].$$

It satisfies $(x - 1)f(x) = x^p - 1$. Substitute y + 1 for x and use the Eisenstein 
criterion to show that f(x) is irreducible over Q. Justify all steps 
carefully. 

32. Find the gcd of 

$$f(x) = x^5 + x^3 + 3x^2 - 2x + 6, \qquad g(x) = x^6 + 3x^4 + x^3 + 5x^2 + 2x + 6,$$

and express it as a linear combination of f(x) and g(x) when they are 
viewed over the field 

(i) Q, (ii) $Z_2$, (iii) $Z_3$, (iv) $Z_5$, (v) $Z_7$, (vi) $Z_{11}$. 

Can you find the prime factorizations of these polynomials over the given 
fields? 

33. Over which fields $Z_p$ does $x^2 + x + 1$ divide $x^5 + x + 1$? 

34. (i) For each n > 1 consider the polynomials 

$$60x^n - 143 \quad \text{and} \quad 60x^n + 91x + 143.$$

Show that neither one has a rational root. 

(ii) Find all n > 1 and positive integers a < 500 for which the polynomial 
$60x^n - a$ has a rational root. 

35. Determine all rational numbers for which the polynomial f(x ) = lx 2 — 5x 
takes an integer value. 

36. If g(x) | f(x) in Z[x], show that g(c) | f(c) for every $c \in Z$. Given $f(x) \in Z[x]$ 
of degree n make use of this fact and Lagrange interpolation to construct 
an algorithm which, for each r < n, determines (in a finite number of 
steps) all polynomials of degree r in Z[x] which divide f(x). 

37. If p is prime, make use of Lagrange interpolation to show that every 
element of Map($Z_p$, $Z_p$) is a polynomial function over $Z_p$. In other words, 
the homomorphism of rings 

$$A: Z_p[x] \to \mathrm{Map}(Z_p, Z_p)$$

is surjective. What is the kernel of A? Show that ker A consists of all 
polynomials in $Z_p[x]$ which are divisible by $x^p - x$. Furthermore, A 
provides a 1-1 correspondence between the set of all polynomials in 
$Z_p[x]$ of degree less than p and Map($Z_p$, $Z_p$). 

38. Show that the polynomial functions over any infinite field F (or integral 
domain D) form an integral domain. 

39. Let F be a finite field with n elements, say $F = \{a_1, a_2, \dots, a_n\}$, and put 

$$\phi(x) = \prod_{i=1}^{n} (x - a_i) \in F[x].$$

[For example, if $F = Z_p$, then $\phi(x) = x^p - x$.] If $f(x), g(x) \in F[x]$ determine 
the same polynomial function (that is, if $\bar f = \bar g$), then $\phi(x)$ divides 
f(x) − g(x). What about the converse? What is the kernel of the 
homomorphism $A: F[x] \to \mathrm{Map}(F, F)$? 

40. For the field F consider the domains 

(i) F[x], (ii) F(x), (iii) F[[x]] (see 3-3-15, Problem 25). In 
each case, decide if the map described by $f(x) \to f(-x)$ is an automorphism. 

41. Define a mapping of Z[x] -► Z p [x] by: 

fix) = Z a iX l ->f(x) = z 
1=0 /=0 

Show it to be an epimorphism. What is the kernel? Making use of the 
map A: Z p [x] -► Map(Z p , Z p ) show that 

a ((Twr) = a (ow) = a(/w). 
42. Show that the greatest integer function [ ] satisfies the following 
properties; for $\alpha, \beta \in R$ and $n \in Z$: 

(i) $[\alpha] \le \alpha < [\alpha] + 1$, 

(ii) $\alpha - 1 < [\alpha] \le \alpha$, 

(iii) $0 \le \alpha - [\alpha] < 1$, 

(iv) $[\alpha + n] = [\alpha] + n$, 

(v) $[\alpha] + [\beta] \le [\alpha + \beta] \le [\alpha] + [\beta] + 1$, 

(vi) $|\alpha - [\alpha] - \tfrac{1}{2}| \le \tfrac{1}{2}$, 

(vii) $[\alpha] + [-\alpha] = 0$ if $\alpha \in Z$, and $= -1$ otherwise, 

(viii) $[2\alpha] - 2[\alpha] = 1$ if $[2\alpha]$ is odd, and $= 0$ if $[2\alpha]$ is even, 

(ix) $\left[ \dfrac{[\alpha]}{n} \right] = \left[ \dfrac{\alpha}{n} \right]$, n positive, 

(x) $[\alpha + \tfrac{1}{2}]$ is the nearest integer to $\alpha$. If two integers are equally close 
to $\alpha$, $[\alpha + \tfrac{1}{2}]$ is the bigger one. 

(xi) $[2\alpha] + [2\beta] \ge [\alpha] + [\beta] + [\alpha + \beta]$. 

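43. If m and n are positive integers with (m, n) = 1 show that 

$$\sum_{i=1}^{n-1} \left[ \frac{im}{n} \right] = \frac{(m-1)(n-1)}{2}.$$

Furthermore, show that if (m, n) = d then 

$$\sum_{i=1}^{n-1} \left[ \frac{im}{n} \right] = \frac{(m-1)(n-1)}{2} + \frac{d-1}{2}.$$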


44. Denoting, as usual, the number of positive divisors of the integer i by 
$\tau(i)$, show that 

$$\sum_{i=1}^{n} \tau(i) = \sum_{d=1}^{n} \left[ \frac{n}{d} \right].$$


45. For any positive integer n and any a e R, show that 


r n 

E « + - = 

i=i L «J 


46. Consider the real number $\alpha$; then 

(i) $\alpha \in Q \iff$ there exists $m \in Z$ for which $[m\alpha] = m\alpha$. 

(ii) $\alpha \in Q \iff$ there exists $m \in Z$ for which $[m!\,\alpha] = m!\,\alpha$. 

(iii) We know that e, the natural base of logarithms, is given by 

$$e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots.$$

For any positive m show that 

$$[m!\,e] = m! \left( \sum_{n=0}^{m} \frac{1}{n!} \right) < m!\,e.$$

Hence, e is irrational. 


47. (i) Making use of the $v_p$ notation from Section 1-5, show that the 
exponent to which the prime p appears in the factorization of n! is 

$$v_p(n!) = \sum_{i \ge 1} \left[ \frac{n}{p^i} \right].$$

(ii) Use part (ix) of Problem 42 to simplify the computation of $v_7(50{,}000!)$. 

48. If the expansion of n in base p is $n = b_0 + b_1 p + \cdots + b_s p^s$ then 

$$v_p(n!) = \frac{n - (b_0 + b_1 + \cdots + b_s)}{p - 1}.$$
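
The two expressions for $v_p(n!)$ in Problems 47 and 48 can be compared numerically before attempting a proof. The following is a minimal sketch, not part of the original text; the function names are illustrative assumptions, and agreement on a finite range is evidence, not a proof.

```python
def v_p_factorial_by_floors(n, p):
    """Exponent of the prime p in n!, as the sum of [n / p^i] for i >= 1."""
    total, power = 0, p
    while power <= n:
        total += n // power   # integer division is the greatest integer function here
        power *= p
    return total


def digit_sum(n, p):
    """Sum b_0 + b_1 + ... + b_s of the digits of n written in base p."""
    s = 0
    while n > 0:
        s += n % p
        n //= p
    return s


def v_p_factorial_by_digits(n, p):
    """Exponent of p in n! via (n - digit sum) / (p - 1), the form in Problem 48."""
    return (n - digit_sum(n, p)) // (p - 1)


if __name__ == "__main__":
    # Compare the two formulas on a small range of n and a few primes.
    for p in (2, 3, 5, 7):
        for n in range(1, 500):
            assert v_p_factorial_by_floors(n, p) == v_p_factorial_by_digits(n, p)
    print("both formulas agree on the tested range")
```

Each pass through the loop in `v_p_factorial_by_floors` divides by one more power of p, which is the repeated division that part (ix) of Problem 42 justifies.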

49. If (a, m) = 1 then, according to 3-2-22, $a^{\phi(m)} \equiv 1 \pmod{m}$. If s is the smallest 
positive integer for which $a^s \equiv 1 \pmod{m}$, show that $s \mid \phi(m)$. 

50. (i) If (m, 133) = (n, 133) = 1, then $133 \mid (m^{18} - n^{18})$. 

(ii) For every n > 0, we have 1010101 (« 37 — n). 


51. Suppose m is odd; then: 

(i) The sum of the elements in any complete residue system mod m (see 
3-2-22) is $\equiv 0 \pmod{m}$. 

(ii) The sum of the elements in any reduced residue system mod m is 
$\equiv 0 \pmod{m}$. 

52. For an odd prime p, show that 

$$\prod_{i=1}^{(p-1)/2} (2i)^2 = 2^2 \cdot 4^2 \cdots (p-1)^2 \equiv (-1)^{(p+1)/2} \pmod{p}.$$

53. For every p > 3, show that the sum of all squares in Z p is 0; in other 
words, the sum of all the quadratic residues is congruent to 0 (mod p). 


54. If p )( a, p )(m and the integer n is arbitrary prove that 



it being understood that (£) = 0 . 

55. If p > 1 there are consecutive quadratic residues (mod p) and consecutive 
quadratic nonresidues (mod/?)—in other words, there exist integers 
1 < a, b < p — 1 for which 
56. Suppose the r consecutive integers a, a + 1, ..., a + r − 1 have the same 
quadratic character (mod p ), while a — 1 and a + r do not have this 
quadratic character—that is, 

Then r < a . 

57. Suppose a> \!p and (|) = — 1; then there is no sequence of a consecutive 
quadratic residues (mod /?), nor is there a sequence of a consecutive 
quadratic nonresidues (mod /?). 

58. Consider the sequence 1,2— 1 where p is an odd prime. Let 


A =the number of pairs a, a + 1 with ^ = 1 and 

B = the number of pairs a, a + 1 with ^ = — 1 and j = — 1, 

C = the number of pairs a, a + 1 with ^ = 1 and + j _ i ? 

Z> = the number of pairs a, a + 1 with = — 1 and ( a + ^ = 1. 

\PJ \ P / 

(/) Prove that 

skm'-t 2 ))—“ s® - (•-?))-*-» 

(//) For each r = 1, 2,1 let 

Then/(r) = /(l), and by evaluating Yf r I j f(r) it follows that/(1) = — 1. 
Consequently, 


A + B — C — D = -1. 


(tii) Show that 


A = -- - --, B = C — D = P A when p= 1 (mod 4) 


p — 3 

A = B = D = — , C = 


4 

/>+l 


when p = 3 (mod 4). 






59. Suppose a is odd; show that: 

(i) $x^2 \equiv a \pmod{2}$ has exactly one solution. 

(ii) $x^2 \equiv a \pmod{2^2}$ has a solution $\iff a \equiv 1 \pmod{4}$; and in this case 
there are exactly two solutions. 

(iii) $x^2 \equiv a \pmod{2^3}$ has a solution $\iff a \equiv 1 \pmod{8}$; and in this case 
there are exactly four solutions. 

(iv) If $s \ge 3$ and $x^2 \equiv a \pmod{2^s}$ has a solution $c_s$ then $x^2 \equiv a \pmod{2^{s+1}}$ 
has a solution of form $c_{s+1} = c_s + t\,2^{s-1}$. 

(v) For any $n \ge 3$, $x^2 \equiv a \pmod{2^n}$ has a solution $\iff a \equiv 1 \pmod{8}$; 
and in this case there are exactly four solutions. 

60. Suppose (a, m) = 1 and we write the prime factorization of m in the form 

$$m = 2^{r_0} \prod_{i=1}^{t} p_i^{r_i}, \qquad r_0 \ge 0,\ r_i > 0 \text{ for } i = 1, \dots, t.$$

Find a necessary and sufficient condition for the congruence $x^2 \equiv a 
\pmod{m}$ to have a solution. Show that if a solution exists then the 
number of solutions is 

$2^t$ when $r_0 \le 1$, 

$2^{t+1}$ when $r_0 = 2$, 

$2^{t+2}$ when $r_0 \ge 3$. 

61. Consider the ring of quaternions, Q (see Miscellaneous Problem 55 of 
Chapter II). 

(i) Show that it is a division ring (by which is meant that it satisfies all 
the requirements for a field except the commutative law for 
multiplication). 

(ii) Find the center of Q. 

(iii) Show that the equation $x^2 = -1$ has an infinite number of solutions 
in Q. 

62. Consider the domain $Z[\sqrt{-6}] = \{a + b\sqrt{-6} \mid a, b \in Z\}$. For each element 
$\alpha = a + b\sqrt{-6}$ of $Z[\sqrt{-6}]$ call $\bar\alpha = a - b\sqrt{-6}$ the conjugate of $\alpha$, and 
define the norm of $\alpha$ by 

$$N(\alpha) = \alpha\bar\alpha = a^2 + 6b^2.$$

Prove the following: 

(i) N(1) = 1; $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta \in Z[\sqrt{-6}]$. 

(ii) $\alpha$ is a unit $\iff N(\alpha) = 1$; so $Z[\sqrt{-6}]$ has only the two units ±1. 

(iii) 2, 3, 5, $\sqrt{-6}$, $2 \pm \sqrt{-6}$ are prime elements of $Z[\sqrt{-6}]$. Of course, 
an element $\pi$ is said to be prime when it cannot be expressed in the 
form $\pi = \alpha\beta$ where neither $\alpha$ nor $\beta$ is a unit of $Z[\sqrt{-6}]$. 

(iv) If $\alpha \neq 0$ is not a unit in $Z[\sqrt{-6}]$ then it can be written as a finite 
product of primes. 

(v) In $Z[\sqrt{-6}]$, factorization into primes need not be unique, since, 
for example, 

$$6 = 2 \cdot 3 = (\sqrt{-6})(-\sqrt{-6}) \quad \text{and} \quad 10 = 2 \cdot 5 = (2 + \sqrt{-6})(2 - \sqrt{-6}).$$

63. For $\alpha = a + b\sqrt{-5}$ in the domain $Z[\sqrt{-5}]$ write $\bar\alpha = a - b\sqrt{-5}$ and 
$N(\alpha) = \alpha\bar\alpha = a^2 + 5b^2$. Then: 

(i) $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta \in Z[\sqrt{-5}]$. 

(ii) The only units of $Z[\sqrt{-5}]$ are ±1. 

(iii) Factorization into primes exists for elements of $Z[\sqrt{-5}]$. 

(iv) The following are primes: 2, 3, 7, $1 \pm \sqrt{-5}$, $2 \pm \sqrt{-5}$, $3 \pm \sqrt{-5}$, 
$1 \pm 2\sqrt{-5}$, $2 \pm 3\sqrt{-5}$. 

(v) Factorization into primes need not be unique; for example, 

$9 = 3 \cdot 3 = (2 + \sqrt{-5})(2 - \sqrt{-5})$, 

$6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$, 

$14 = 2 \cdot 7 = (3 + \sqrt{-5})(3 - \sqrt{-5})$, 

$21 = 3 \cdot 7 = (1 + 2\sqrt{-5})(1 - 2\sqrt{-5})$, 

$49 = 7 \cdot 7 = (2 + 3\sqrt{-5})(2 - 3\sqrt{-5})$. 

64. Consider the domain $Z[\sqrt{5}] = \{a + b\sqrt{5} \mid a, b \in Z\}$. For $\alpha = a + b\sqrt{5}$ 
put $\bar\alpha = a - b\sqrt{5}$ and $N(\alpha) = a^2 - 5b^2$. Then: 

(i) $2 + \sqrt{5}$, $9 + 4\sqrt{5}$, $2 - \sqrt{5}$, $9 - 4\sqrt{5}$ are units of $Z[\sqrt{5}]$. 

(ii) For every $n \in Z$, $\pm(2 + \sqrt{5})^n$ is a unit. 

(iii) The elements $3 + 2\sqrt{5}$, $-4 + \sqrt{5}$, $-13 + 6\sqrt{5}$ are associates in 
$Z[\sqrt{5}]$. 

(iv) $\alpha$ is a unit $\iff N(\alpha) = \pm 1$. 

(v) If $N(\alpha)$ is a prime in Z then $\alpha$ is a prime of $Z[\sqrt{5}]$. 

(vi) The elements $3 + 2\sqrt{5}$, $-4 + \sqrt{5}$, $-13 + 6\sqrt{5}$ are primes of 
$Z[\sqrt{5}]$ (they should really be viewed as the same prime) and so are 
$\pm 2$, $3 \pm \sqrt{5}$. 

(vii) In $Z[\sqrt{5}]$ factorization into primes exists, but is not unique [witness 
$4 = 2 \cdot 2 = (3 + \sqrt{5})(3 - \sqrt{5})$]. 
65. Let 

$$\omega = \frac{1 + \sqrt{5}}{2}$$

and consider the integral domain $Z[\omega] = \{a + b\omega \mid a, b \in Z\}$. It contains 
the domain $Z[\sqrt{5}]$. Let 

$$\bar\omega = \frac{1 - \sqrt{5}}{2},$$

and for $\alpha = a + b\omega$ put $\bar\alpha = a + b\bar\omega$. Define $N(\alpha) = \alpha\bar\alpha = a^2 + ab - b^2 \in Z$. 

(i) The map $\alpha \to \bar\alpha$ is an automorphism of $Z[\omega]$. 

(ii) If $\alpha \in Z[\sqrt{5}] \subset Z[\omega]$ then $N(\alpha)$ coincides with the norm of $\alpha$ defined 
in the preceding problem. 

(iii) $N(\alpha\beta) = N(\alpha)N(\beta)$ for all $\alpha, \beta \in Z[\omega]$. 

(iv) $\alpha$ is a unit $\iff N(\alpha) = \pm 1$. 

(v) Every unit of $Z[\sqrt{5}]$ is a unit of $Z[\omega]$. The converse is false; in fact, 
$\omega$ is a unit of $Z[\omega]$, and so is $\pm\omega^n$ for every $n \in Z$. 

(vi) If $N(\alpha)$ is prime (in Z) then $\alpha$ is prime (in $Z[\omega]$). 

(vii) Factorization into primes exists in $Z[\omega]$. 

(viii) Are 2, $3 \pm \sqrt{5}$ primes of $Z[\omega]$? 

(ix) Is factorization into primes unique in $Z[\omega]$? 

66. For $\alpha = a + bi$ in the domain of Gaussian integers Z[i], put 
$N(\alpha) = (a + bi)(a - bi) = a^2 + b^2$. Then: 

(i) $N(\alpha\beta) = N(\alpha)N(\beta)$. 

(ii) There are four units: ±1, ±i. 

(iii) $N(\alpha)$ prime in Z $\implies$ $\alpha$ prime in Z[i]. 

(iv) 2 = (1 + i)(1 − i); this is a factorization of 2 into primes (1 + i) 
and (1 − i), which are associates. 

(v) 3 is prime in Z[i]; so is 7. More generally, any prime $p \equiv 3 \pmod{4}$ 
is a prime of Z[i]. 

(vi) We have the factorization into primes 5 = (2 + i)(2 − i). How 
does it compare with the prime factorization 5 = (1 − 2i)(1 + 2i)? 

(vii) 13 factors into primes; in fact, 

13 = (3 + 2i)(3 − 2i) = (2 + 3i)(2 − 3i). 

Compare these factorizations. 

(viii) Any prime $p \equiv 1 \pmod{4}$ factors as the product of two distinct 
primes of Z[i]. 

(ix) Factorization into primes holds in Z[i]. 

(x) What about uniqueness? 
67. (i) The polynomial $f(x) = x^2 + 1$ is irreducible over $Z_3$. Extend the 
procedure of 3-2-23, Problem 28 and 3-5-29, Problem 27, by using 
an $\alpha$ which satisfies $\alpha^2 + 1 = 0$, to construct a field, $Z_3[\alpha]$, with 9 
elements. 

(ii) Use the irreducible polynomial $f(x) = x^3 - x^2 + 1$ over $Z_3$ to 
construct a field with 27 elements. 

(iii) Consider an arbitrary field F and an irreducible polynomial 
$f(x) = a_0 + a_1 x + \cdots + a_n x^n$ over F. Let $\xi$ be a formal symbol which is 
taken to satisfy $f(\xi) = a_0 + a_1 \xi + \cdots + a_n \xi^n = 0$. Generalizing the 
above, show how 

$$F[\xi] = \{c_0 + c_1 \xi + \cdots + c_{n-1} \xi^{n-1} \mid c_0, c_1, \dots, c_{n-1} \in F\}$$

can be made into a field which contains F. 




GROUPS 


In the preceding chapters, we studied algebraic systems with two 
operations: namely, the particular objects Z and $Z_m$, and the general objects 
known as "rings." In this chapter, we turn to the study of logically simpler 
algebraic systems, known as "groups"; they are algebraic objects with a 
single operation. 

It is a matter of pedagogical taste whether one does groups or rings first. 
Most algebra books follow the rather natural path of increasing complexity 
(that is, of more and more axioms), so they do groups first. Our approach, on 
the other hand, has been based on the idea of moving from the familiar to the 
unfamiliar and from the concrete to the abstract; thus, we were led to study 
the integers first, then rings, and then groups. Of course, when everything is 
said and done, one ends with the same body of knowledge, no matter what 
order is used for organizing the material. 
The contents of this chapter are almost entirely pure algebra. As the facts 
about groups are developed, we shall often observe analogies with facts we 
already know about rings. Although groups are of interest in themselves and 
for various applications, our interest stems primarily from the 
number-theoretic connections. These connections involve $Z_m^*$ (the set of units of $Z_m$), 
which is a group whose structure we try to determine. 


4-1. Basic Facts and Examples 

Suppose G is a nonempty set on which we have a binary operation ·. We 
recall (see 2-1-2) what this means: we are given a mapping from 
G × G → G which assigns to each ordered pair (a, b) ∈ G × G an element 
a · b ∈ G. The notation · for the operation is fairly common but, of course, any 
other notation is equally valid. In practice, we shall usually drop the · and 
write ab instead of a · b. 

In virtue of our past experience, the kinds of properties of this operation 
which interest us are: 

(1) closure (this is really redundant, because closure is included in the 
definition of an operation), 

(2) the associative law, 

(3) the existence of an identity, 

(4) the existence of inverses, 

(5) the commutative law. 

These properties appear as ingredients in our first result (and in all subsequent 
results too), which combines the basic definition of a group with consideration 
of equivalent formulations for its axioms. 


4-1-1. Theorem. Suppose {G, ·} is a semigroup; by this we mean that · is 
an operation on the nonempty set G with the properties 

(1) G is closed under the operation, 

(2) the associative law holds. 

Then the following conditions are equivalent, and when any one of them 
holds, {G, ·} is said to be a group. 

I. (i) G has an identity e; that is, ea = ae = a for every a ∈ G. 

   (ii) Every element of G has an inverse; that is, given any a ∈ G 
   there exists a' ∈ G for which a'a = aa' = e. 

II. For any choice of a, b ∈ G the equations ax = b and ya = b have 
solutions in G for x and y. 

III. (i) G has a left identity e; that is, ea = a for every a ∈ G. 

   (ii) Every element of G has a left inverse; that is, given any 
   a ∈ G there exists a' ∈ G with a'a = e. 

III'. (i) G has a right identity e; that is, ae = a for every a ∈ G. 

   (ii) Every element of G has a right inverse; that is, given any 
   a ∈ G there exists a' ∈ G with aa' = e. 

In addition, if the commutative law (that is, ab = ba for all a, b ∈ G) 
holds in a group, or a semigroup, we say it is abelian, after Abel 
(1802–1829). 

Proof: We need only prove 

I ⟹ II ⟹ III ⟹ I, 

as the proof of I ⟹ II ⟹ III' ⟹ I will then go in the same way. 

I ⟹ II: Given a, b ∈ G there exists, by I(ii), an inverse a' of a, so a'a = 
aa' = e. Then x = a'b and y = ba' are solutions of ax = b and ya = b, 
respectively, since [making use of the associative law and I(i)] we have 

a(a'b) = (aa')b = eb = b and (ba')a = b(a'a) = be = b. 

Note how this proof requires e to be an identity on both right and left, 
and a' to be an inverse of a on both right and left. 

II ⟹ III: Fix an a ∈ G. Applying II for b = a, there is a solution in G of the 
equation ya = a; call it e, so ea = a. Now, e is a left identity for this a, but 
this does not say that the same e is a left identity for every element of G. Thus, 
we would like to show that for any b ∈ G we have eb = b. For this, let c be a 
solution of ax = b (its existence is guaranteed by II). So ac = b and (using the 
associative law) 

eb = e(ac) = (ea)c = ac = b. 

Hence, e is indeed a left identity for every element of G. Furthermore, given 
a ∈ G there exists, according to II, an a' ∈ G such that a'a = e; so a has a left 
inverse. 

Note how this proof requires both parts of II; that is, the solvability of 
both ax = b and ya = b. 

III ⟹ I: We have, by hypothesis, ea = a for every a ∈ G and the existence of 
a' ∈ G for which a'a = e. We want to prove that ae = a for every a ∈ G (so e 
is an identity on both sides) and aa' = e (so a' is an inverse on both sides). For 
this, in virtue of III(ii), let a'' be a left inverse of a', so a''a' = e. Then using 
the associative law and the fact that e is a left identity, 

aa' = e(aa') = (a''a')(aa') = a''[a'(aa')] 

    = a''[(a'a)a'] = a''(ea') = a''a' = e, 

so a' is also a right inverse of a. Now, e is also a right identity, since 
ae = a(a'a) = (aa')a = ea = a. 

This completes the proof. ∎ 
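
For a finite semigroup, the three conditions of 4-1-1 can be tested by brute force. The sketch below is illustrative only and not part of the original text: the function names and the three sample operations are assumptions chosen for the example. The last sample shows why III pairs a left identity with left inverses: a left identity together with right inverses does not force a group.

```python
from itertools import product


def is_associative(elems, op):
    """Check (ab)c = a(bc) for every triple."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(elems, repeat=3))


def condition_I(elems, op):
    """Two-sided identity e, and a two-sided inverse for every element."""
    for e in elems:
        if all(op(e, a) == a and op(a, e) == a for a in elems):
            return all(any(op(x, a) == e and op(a, x) == e for x in elems)
                       for a in elems)
    return False


def condition_II(elems, op):
    """ax = b and ya = b are solvable for every a, b."""
    return all(any(op(a, x) == b for x in elems) and
               any(op(y, a) == b for y in elems)
               for a, b in product(elems, repeat=2))


def condition_III(elems, op):
    """Some left identity e for which every element has a left inverse."""
    for e in elems:
        if all(op(e, a) == a for a in elems):
            if all(any(op(x, a) == e for x in elems) for a in elems):
                return True
    return False


def add_mod_6(a, b):
    return (a + b) % 6


def mul_mod_6(a, b):
    return (a * b) % 6


def right_projection(a, b):
    # ab = b: every element is a left identity, and ax = e is always solvable,
    # yet this semigroup is not a group.
    return b


if __name__ == "__main__":
    Z6 = list(range(6))
    for name, elems, op in [("Z_6 under +", Z6, add_mod_6),
                            ("Z_6 under *", Z6, mul_mod_6),
                            ("ab = b on {0, 1}", [0, 1], right_projection)]:
        print(name, is_associative(elems, op),
              condition_I(elems, op), condition_II(elems, op),
              condition_III(elems, op))
```

In each sample the three conditions return the same truth value, as the theorem requires for any semigroup.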

We shall usually be somewhat careless and refer to a group G instead of 
{G, •}. The operation •, which is obviously called “multiplication,” will be 
taken for granted; it may be commutative or it may not. On occasion, the 
symbol + will be used for the operation (in which case, one naturally calls the 
operation "addition"). It is a strong convention in this subject that when + 
is used, the operation is understood to be commutative. This is entirely 
consistent with our experience with rings, where addition is always commutative. 

4-1-2. Examples. (1) The integers under addition, {Z, +}, are clearly an 
abelian group; 0 is an identity element for this operation, and −a is an inverse 
for a ∈ Z. 

(2) The integers under multiplication, {Z, ·}, satisfy closure, 
associativity, the commutative law, and have an identity element 1; so it is an abelian 
semigroup with identity. But it is not a group because not every element has 
an inverse. 

(3) For any integer m > 1, consider {$Z_m$, +}, the integers modulo m 
under addition. This is an abelian group; $0 = \lfloor 0 \rfloor_m$ is an identity element, and 
$\lfloor -a \rfloor_m$ is an inverse for $\lfloor a \rfloor_m \in Z_m$. 

(4) The integers modulo m under multiplication, {$Z_m$, ·}, is an abelian (or 
simply, commutative) semigroup with identity $\lfloor 1 \rfloor_m$. It is not a group, because 
not every element has an inverse. 

(5) More generally, consider any ring {R, +, ·}. If we recall the axioms for 
addition, they assert precisely that {R, +} is an abelian group; it is known as 
"the additive group of the ring." What about {R, ·}, the set R under the 
operation of multiplication? In virtue of the axioms concerning multiplication 
in a ring, closure and associativity are satisfied, so {R, ·} is a semigroup. This 
semigroup has an identity if and only if the ring {R, +, ·} has a unity (that is, 
identity) for multiplication. However, {R, ·} can never be a group, for even if 
an identity exists the element 0 can never have an inverse under the operation 
of multiplication. 

There are many additional examples of groups to be given, but we defer 
them for a moment to discuss the elementary facts about groups (all of which 
are really known to us from the work on rings and fields). 
4-1-3. Proposition. In any group G = {G, ·} we have: 

(1) The generalized associative law holds; that is, for any $a_1, a_2, \dots, 
a_n \in G$ the meaning of $a_1 a_2 \cdots a_n$ is unambiguous: it does not matter 
how parentheses are inserted. 

(2) The identity e is unique. 

(3) The inverse of any a ∈ G is unique; we denote it by $a^{-1}$. 

(4) The cancellation laws hold; more precisely, left cancellation says: 
ab = ac ⟹ b = c, and right cancellation says: ba = ca ⟹ b = c. 

(5) The solution of each of the equations ax = b and ya = b is unique. 

(6) For every a ∈ G, $(a^{-1})^{-1} = a$. 

(7) The inverse of a product is the product of the inverses in the reverse 
order; that is, 

$(ab)^{-1} = b^{-1}a^{-1}$ for all a, b ∈ G. 

(8) The standard definitions of powers and their properties apply; thus 
for any a ∈ G and m, n ∈ Z, 

$a^0 = e$, $a^{-n} = (a^{-1})^n$, $a^m a^n = a^{m+n}$, $(a^m)^n = a^{mn}$. 


Proof: The details could safely be left to the reader, but for the record we 
comment briefly on each statement. 

(1) This goes exactly like the proof of the generalized associative law for 
multiplication in a ring, 2-4-1. Since only the associative law is used in the 
proof, this assertion is obviously valid in any semigroup. 

(2) If e' ∈ G is also an identity, then e = e'e = e'. 

(3) If both a' and a'' are inverses of a, then 

a' = a'e = a'(aa'') = (a'a)a'' = ea'' = a''. 

(4) If ab = ac, then $b = (a^{-1}a)b = a^{-1}(ab) = a^{-1}(ac) = (a^{-1}a)c = c$. 
Similarly, ba = ca ⟹ b = c. 

(5) If both $c_1$ and $c_2$ are solutions of ax = b, then $ac_1 = b = ac_2$, so by 
cancellation, $c_1 = c_2$. Thus $a^{-1}b$ is the unique solution of ax = b. Similarly 
$ba^{-1}$ is the unique solution of ya = b. 

(6) Since $(a^{-1})^{-1}(a^{-1}) = e = aa^{-1}$, cancellation gives $(a^{-1})^{-1} = a$. 

(7) By definition, $(ab)(ab)^{-1} = e$. On the other hand, 

$(ab)(b^{-1}a^{-1}) = ((ab)b^{-1})(a^{-1}) = (a(bb^{-1}))a^{-1} 
= (ae)a^{-1} = aa^{-1} = e.$ 

So, by cancellation, $(ab)^{-1} = b^{-1}a^{-1}$. 

(8) Most of the preceding assertions were proved in 3-2-5 when we were 
discussing the units of a commutative ring with unity. Even though our group 
need not be commutative, the story about exponents still goes as in 3-2-5, part 
(9), because we are concerned here solely with the powers of a single element 
and these always commute with each other. ∎ 
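
Parts (2), (4) and (7) of 4-1-3 are easy to confirm by machine in a small noncommutative group. The sketch below is an illustration under stated assumptions, not the text's own example: it uses the six permutations of {0, 1, 2} under composition (a group the text has not introduced at this point), and the helper names `compose` and `inverse` are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Product pq of two permutations given as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """The permutation that undoes p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


S3 = list(permutations(range(3)))   # all six permutations of {0, 1, 2}

if __name__ == "__main__":
    # (2) exactly one element acts as an identity
    identities = [e for e in S3
                  if all(compose(e, a) == a and compose(a, e) == a for a in S3)]
    print("identities:", identities)

    # (4) left cancellation: for fixed a, the products ab are all different
    print("left cancellation:",
          all(len({compose(a, b) for b in S3}) == len(S3) for a in S3))

    # (7) the inverse of a product reverses the order of the factors
    print("(ab)^-1 = b^-1 a^-1:",
          all(inverse(compose(a, b)) == compose(inverse(b), inverse(a))
              for a in S3 for b in S3))

    # the unreversed order fails for at least one pair, since S3 is not abelian
    print("(ab)^-1 = a^-1 b^-1 always:",
          all(inverse(compose(a, b)) == compose(inverse(a), inverse(b))
              for a in S3 for b in S3))
```

The final check is the point of the example: in a nonabelian group the order in (7) cannot be ignored.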

4-1-4. Remark. If the group G is also abelian, then we have additional 
properties such as the generalized commutative-associative law (see 2-4-2), and 
$(ab)^n = a^n b^n$ for all a, b ∈ G, n ∈ Z. The verifications should cause the reader 
no difficulty. 

When the abelian group G is written additively (that is, with operation +) 
some of the properties of 4-1-3 take the following forms: the identity 0 is 
unique; the inverse of any a ∈ G is unique, we denote it by −a; the 
cancellation law (because of commutativity, only one cancellation law is needed) 
holds, that is, a + b = a + c ⟹ b = c; the equation a + x = b has a unique 
solution, namely, b − a; −(−a) = a for every a ∈ G; −(a + b) = −b − a = 
−a − b; for m, n ∈ Z, a, b ∈ G, we have 0a = 0, (−n)a = n(−a), (m + n)a = 
ma + na, m(na) = (mn)a, m(a + b) = ma + mb. The reader may well find these 
statements rather boring; after all, we proved them in a ring (more precisely, 
for the additive group of an arbitrary ring) in Sections 2-1 and 2-4. 

We have seen in 4-1-3 that the cancellation laws hold in a group, and that 
they play a key role in proving various properties. The question naturally 
arises if the cancellation laws can be used as replacements for the decisive 
group axioms. More precisely: given a semigroup in which both cancellation 
laws hold, is it a group? The answer is in the negative, as may be seen from 
a simple example. Consider {2Z, ·}, the even integers under multiplication. 
Closure, associativity, commutativity, and cancellation are clear—so this is 
a commutative semigroup with both cancellation laws. But it is not a group; 
there is no identity, and of course inverses cannot exist. 

On the other hand, as our next result indicates, there is one situation in 
which the cancellation laws guarantee that a semigroup is a group. 

4-1-5. Proposition. Suppose G is a finite semigroup. If both cancellation 
laws hold, then G is a group. 

Proof: Suppose G consists of n elements, and write 

$G = \{c_1, c_2, \dots, c_n\}$. 

Thus, $c_1, c_2, \dots, c_n$ are distinct and every element of G appears among them. 
Now, let any a, b ∈ G be given. Consider $ac_1, ac_2, \dots, ac_n$; these n elements 
are distinct since, by left cancellation, $ac_i = ac_j \implies c_i = c_j \implies i = j$. Because G 
has n elements, we must have 

$G = \{ac_1, ac_2, \dots, ac_n\}$. 

Since the element b appears in this set, we conclude that the equation ax = b 
has a solution in G. 

In similar fashion, using right cancellation, the equation ya = b has a 
solution in G. Thus, Axiom II of 4-1-1 is satisfied and G is a group. 

It is worthwhile to give a slightly more abstract formulation of this proof. 
For any a ∈ G, consider the mapping 

$\phi_a : G \to G$ 

defined by 

$\phi_a(x) = ax$, x ∈ G. 

In other words, $\phi_a$ amounts to multiplication by a, on the left. By left 
cancellation, $\phi_a$ is one-to-one [that is, $\phi_a(x_1) = \phi_a(x_2) \implies ax_1 = ax_2 \implies x_1 = x_2$]. 
Because G is finite, the map must be onto G. Therefore, any b ∈ G is in the 
image of $\phi_a$, so the equation ax = b has a solution. Similarly, to show that 
ya = b has a solution, one works with right multiplication by a; the notation 
might look like $\psi_a : G \to G$ where $\psi_a(y) = ya$, y ∈ G. ∎ 
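
The counting argument of 4-1-5 can be watched at work on a small example. The following sketch is not from the text: the sample set {2, 4, 6, 8} under multiplication modulo 10 and all function names are assumptions chosen for illustration. It is a finite counterpart of the even integers under multiplication, and here cancellation does force a group, whose identity turns out to be 6.

```python
def is_closed(elems, op):
    """Every product of two elements lies in the set again."""
    return all(op(a, b) in elems for a in elems for b in elems)


def left_cancellative(elems, op):
    """ac_i = ac_j forces c_i = c_j: for fixed a the products ac are all distinct."""
    return all(len({op(a, c) for c in elems}) == len(elems) for a in elems)


def right_cancellative(elems, op):
    """c_i a = c_j a forces c_i = c_j."""
    return all(len({op(c, a) for c in elems}) == len(elems) for a in elems)


def solve_ax_eq_b(a, b, elems, op):
    """Some x with ax = b; it exists because left multiplication by a is onto."""
    return next(x for x in elems if op(a, x) == b)


def solve_ya_eq_b(a, b, elems, op):
    """Some y with ya = b; it exists because right multiplication by a is onto."""
    return next(y for y in elems if op(y, a) == b)


def times_mod_10(a, b):
    return (a * b) % 10


G = [2, 4, 6, 8]

if __name__ == "__main__":
    print("closed:", is_closed(G, times_mod_10))
    print("cancellation:", left_cancellative(G, times_mod_10),
          right_cancellative(G, times_mod_10))
    for a in G:
        # each row of the multiplication table is a rearrangement of G
        print(a, [times_mod_10(a, c) for c in G])
    # As in the proof of II => III in 4-1-1, a solution of ya = a is the identity.
    e = solve_ya_eq_b(2, 2, G, times_mod_10)
    print("identity:", e)
```

Associativity needs no check here because the operation is inherited from multiplication of integers modulo 10.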

We turn next to additional examples of groups. 

4-1-6. Example. Consider {R, ·}, the reals under multiplication. Obviously 
this is a commutative semigroup with identity 1; however, as in 4-1-2, part 
(5), 0 cannot have an inverse, so this is not a group. 

What happens if we throw out 0, and consider the set of all nonzero reals? 
Denoting it by R* = R − {0}, we note that in {R*, ·} we have closure (since 
the product of two nonzero real numbers is not zero), the associative law, the 
commutative law, the identity, 1, and every a ∈ R* has an inverse for 
multiplication, namely $a^{-1} = 1/a \in R^*$. Thus, {R*, ·} is a commutative group. 

More generally, consider any field F = {F, +, ·}. As before, {F, ·} is not 
a group because 0 has no inverse. If, as usual, we let F* denote the set of all 
nonzero elements of F, then F* is closed under multiplication because in a 
field (which is an integral domain) the product of two nonzero elements is not 
zero. Moreover, according to the definition of a field, 3-2-7, every element of 
F* is a unit; that is, has an inverse for multiplication. It follows that F* = 
{F*, ·} is a commutative group; it is known as "the multiplicative group of the 
field F." Of course, F* may also be characterized as the set (or group) of units 
of F. 

In particular, if p is prime, then $Z_p$ is a field and in virtue of the foregoing, 
$Z_p^*$, the set of all nonzero elements (or equivalently, the set of units) of $Z_p$, is a 
group with p − 1 elements. 

4-1-7. Examples. For any m > 1, consider the commutative ring with 
unity $Z_m$ = {$Z_m$, +, ·}. As noted in 4-1-2, {$Z_m$, +} is an abelian group, but 
{$Z_m$, ·} is not a group. In order to extract a multiplicative group from $Z_m$, 
we must obviously include only elements which have an inverse under 
multiplication. Therefore, let us consider $Z_m^*$, the set of all units of $Z_m$ (that 
is, all elements of $Z_m$ which have a multiplicative inverse). It is straightforward 
to verify that {$Z_m^*$, ·} is an abelian group with $\phi(m)$ elements, but there is no 
additional effort involved in treating this in a more general context. 

Thus, let R = {R, +, ·} be any commutative ring with unity, and denote the 
set of all units of R by R*. (This is consistent with the earlier notation when R 
is a field F. There F* is defined as the set of all nonzero elements, which is the 
same as the set of all units.) Consider {R*, ·}. Clearly, e (the identity) belongs 
to R*, and the associative law for multiplication holds for elements of R*. 
Furthermore, according to 3-2-5, parts (8) and (7), the product of units is a 
unit (so R* is closed under multiplication) and any unit has an inverse which 
is also a unit (so every element of R* has an inverse in R*). Thus, {R*, ·} is an 
abelian group; it is known as the group of units of the ring R. 

Incidentally, the notion of "group of units" carries over to an arbitrary 
ring R with unity e, even if it is not commutative. More precisely, we have 
ea = ae = a for all a ∈ R, and we say that a ∈ R is a unit when there exists 
a' ∈ R with aa' = a'a = e. The properties of units then go as in 3-2-5, and it 
follows easily that the set of all units (which is once more denoted by R*) is a 
group under multiplication. In more detail: If a, b ∈ R*, then (ab)(b'a') = e 
and (b'a')(ab) = e, so ab ∈ R* and R* is closed under multiplication; 
associativity is clear; obviously e ∈ R*; finally, if a ∈ R*, then aa' = a'a = e which 
also says that a' is a unit, so a has an inverse in R*. 
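
The claims about $Z_m^*$ can be checked directly for small m. The sketch below is a brute-force illustration, not part of the original text; the function names `units` and `euler_phi` are assumptions, and each residue class is represented by its least nonnegative representative.

```python
from math import gcd


def units(m):
    """Residues a in {0, ..., m-1} having a multiplicative inverse modulo m."""
    return [a for a in range(m) if any((a * b) % m == 1 for b in range(m))]


def euler_phi(m):
    """Number of integers in 1..m that are relatively prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def is_group_of_units(m):
    """Check closure, identity and inverses for the units of Z_m."""
    U = units(m)
    closed = all((a * b) % m in U for a in U for b in U)
    has_identity = 1 in U
    has_inverses = all(any((a * b) % m == 1 for b in U) for a in U)
    return closed and has_identity and has_inverses


if __name__ == "__main__":
    print(units(12))
    for m in range(2, 31):
        assert is_group_of_units(m)
        assert len(units(m)) == euler_phi(m)
    print("Z_m^* is a group with phi(m) elements for 2 <= m <= 30")
```

Associativity and commutativity are inherited from $Z_m$, so only closure, the identity and inverses need to be tested.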

4-1-8. Example. Consider the field of complex numbers C. Every element 
of C is uniquely of form z = x + yi where x,ye R, and we have the notion of 
absolute value of a complex number—it is defined by 

M = s/x 2 +y 2 . 

Thus, if we view the complex numbers, geometrically, as the plane R x R, 
then \z\ represents the distance from the origin ( 0 , 0 ) to the point (x, y). 

Among the significant properties of absolute value one finds: 

(i) \z\ = yjzz where z = x — yi is the complex conjugate. 

(ii) \z\ > 0; \z\ = 0 <=> z = 0. 

(in) \z x z 2 \ = \z^ | \z 2 \ 9 z u z 2 e C. 

(iv) | z t + z 2 1 < \z t \ + |z 2 |, z u z 2 e C. 

The verification of these facts is entirely mechanical—except for (iv) 9 which 
is known as the triangle inequality—and the details may be left to the reader. 
We shall make heavy use of (in), which says that the absolute value is multi¬ 
plicative. 
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Let us look at 


W= {z g C \ |z| = 1}. 

In words, W consists of all complex numbers of absolute value 1. Geometrical¬ 
ly, W consists of all points whose distance from the origin is 1—that is, W is 
the circle of radius 1 with center at the origin. This explains why W is known 
as the unit circle. 

We show W is a group under multiplication. If z₁, z₂ ∈ W, then |z₁z₂| =
|z₁| |z₂| = 1 · 1 = 1, so z₁z₂ ∈ W and we have closure. The associative law is
trivial because we are dealing with the product of complex numbers. The
element 1 = 1 + 0i belongs to W, so there is an identity. Finally, consider z ∈ W;
clearly z ≠ 0 so, because C is a field, z has a multiplicative inverse; call it
z⁻¹ (we are not interested here in what z⁻¹ looks like). From 1 = zz⁻¹, we have

\[ 1 = |1| = |zz^{-1}| = |z|\,|z^{-1}| = |z^{-1}|. \]

Thus, z⁻¹ ∈ W, and z has an inverse in W. Since multiplication of complex
numbers is commutative, we have proved that W is an abelian group (often
called the circle group) under multiplication.
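
As a numerical illustration only (a minimal sketch, not a substitute for the argument above), the closure and inverse properties of W can be spot-checked in Python on a few sample points of the unit circle. The helper name on_unit_circle, the sample angles, and the tolerance are illustrative assumptions; floating-point arithmetic can only approximate the condition |z| = 1.

```python
import cmath
import math

def on_unit_circle(z: complex, tol: float = 1e-12) -> bool:
    """True when |z| equals 1 up to a floating-point tolerance (assumed value)."""
    return math.isclose(abs(z), 1.0, abs_tol=tol)

# A few sample elements of W, written as e^{it}; the angles are arbitrary.
samples = [cmath.exp(1j * t) for t in (0.0, 0.7, 2.0, math.pi, 4.5)]

# Closure: |z w| = |z| |w| = 1 for z, w in W.
closed = all(on_unit_circle(z * w) for z in samples for w in samples)

# Inverses: |1/z| = 1 whenever |z| = 1.
inverses_in_w = all(on_unit_circle(1 / z) for z in samples)

print(closed, inverses_in_w)
```

A finite sample proves nothing in general, of course; the point is only to see property (iii) at work.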

4-1-9. Example. In ordinary 3-space consider a plane on which an
equilateral triangle has been drawn, and suppose the vertices are labeled as in
the accompanying diagram.

[Figure: equilateral triangle with vertices labeled 1, 2, 3.]

Now, imagine a cardboard copy of this triangle placed over it. By a rigid
motion of the triangle we shall mean any operation of
removing the cardboard from the plane, moving it around in space (we even
permit the cardboard to pass through the plane; that is, our plane is
conceptual rather than physical) and then placing it once more on the triangular
area in the plane. The end result of a rigid motion finds the cardboard in the
same place as before except that its vertices may not be where they were
originally. For example, the vertex of the cardboard which was at 1 may end up at
either 1, 2, or 3. (It should be understood that the labels 1, 2, 3 are considered
part of the plane, and they never move!) Obviously, a rigid motion is
described completely when we know how the vertices are moved. It now follows
that there are exactly six rigid motions of the equilateral triangle; in fact,
taking the vertex of the cardboard at 1 there are three choices as to where it can
end up, and then there are two choices as to where the vertex at 2 ends up.

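Let us introduce a notation for a rigid motion σ. If, for example, σ moves
the vertices in the following way

1 → 3, 2 → 2, 3 → 1,

then we shall write

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}. \]

In other words, underneath each vertex we list the vertex at which it ends
under σ. The six rigid motions of the equilateral triangle may therefore be
denoted as follows.

\[ e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \quad
   \sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \quad
   \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}, \]

\[ \tau_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \quad
   \tau_2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \quad
   \tau_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}. \]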
Note that e is an “ identity motion ” in the sense that under it each vertex ends 
up where it started. 

There is another kind of motion of our triangle which is of interest. By a
symmetry of the equilateral triangle we mean a rotation of the cardboard
triangle about some axis in such a way that it ends in the same place as before
(but, of course, the vertices may have moved). Obviously, any symmetry is a
rigid motion. To describe symmetries of our triangle consider the accompanying
picture where the dotted lines are altitudes (or, equivalently, perpendicular
bisectors, medians, angle bisectors).

[Figure: the equilateral triangle with vertices 1, 2, 3, its altitudes drawn as dotted lines, and their common point labeled 0.]

Consider the symmetry obtained by taking as axis the line through 0
perpendicular to the plane of the triangle and rotating about this axis by 120°
in a counterclockwise direction. The action of this symmetry on the vertices is
clearly that of σ₁. Similarly, by rotating by 240° in a counterclockwise
direction about this axis, we have the motion σ₂. Furthermore, the symmetries
which involve flipping the triangle over (that is, rotating by 180°) with
respect to the altitudes through points 1, 2, or 3, are clearly τ₁, τ₂, and τ₃,
respectively. Finally, e should be viewed as a trivial symmetry under which
nothing moves.

Thus we see that the equilateral triangle has exactly six distinct symmetries,
so for the equilateral triangle, the set of symmetries is the same as the set of
rigid motions. If we denote this set by G = {e, σ₁, σ₂, τ₁, τ₂, τ₃}, then G
becomes a group when multiplication is defined as the composition of
mappings. In more detail, for any σ, τ ∈ G, στ is taken to mean σ ∘ τ; that is,
apply the rigid motion τ and then apply σ to the result.

Now, G is closed under multiplication because the composition of two
rigid motions is obviously a rigid motion. (If the elements of G are viewed as
symmetries, it is not obvious that the composition of two symmetries is a
symmetry. But we now know this, precisely because G is closed.) It is also
instructive to verify closure concretely by constructing the multiplication
table; it takes the form:

     ∘  |  e    σ₁   σ₂   τ₁   τ₂   τ₃
    ----+------------------------------
     e  |  e    σ₁   σ₂   τ₁   τ₂   τ₃
     σ₁ |  σ₁   σ₂   e    τ₃   τ₁   τ₂
     σ₂ |  σ₂   e    σ₁   τ₂   τ₃   τ₁
     τ₁ |  τ₁   τ₂   τ₃   e    σ₁   σ₂
     τ₂ |  τ₂   τ₃   τ₁   σ₂   e    σ₁
     τ₃ |  τ₃   τ₁   τ₂   σ₁   σ₂   e


The reader is advised to derive the table on his own, to see how things go. For
example, to compute

\[ \tau_2 \sigma_2 = \tau_2 \circ \sigma_2
   = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix} \circ
     \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \]

we note that σ₂ maps 1 → 3 and then τ₂ maps 3 → 1, so τ₂σ₂ maps 1 → 1.
Similarly, σ₂: 2 → 1 and τ₂: 1 → 3, so τ₂σ₂: 2 → 3; and finally σ₂: 3 → 2 and
τ₂: 2 → 2, so τ₂σ₂: 3 → 2. Thus,

\[ \tau_2 \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix} = \tau_1. \]

The other products go in exactly the same way.
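
The computation of a product can be mechanized. The following is a minimal sketch in Python, assuming each rigid motion is stored as the bottom row of its two-row symbol; the names s1, s2, t1, t2, t3 and the helper functions are illustrative labels for σ₁, σ₂, τ₁, τ₂, τ₃, with σ₂ and τ₂ taken from the worked product above and the remaining labels chosen to be consistent with it.

```python
from itertools import product

# Bottom rows of the two-row symbols: position i-1 holds the image of vertex i.
MOTIONS = {
    "e":  (1, 2, 3),
    "s1": (2, 3, 1),
    "s2": (3, 1, 2),
    "t1": (1, 3, 2),
    "t2": (3, 2, 1),
    "t3": (2, 1, 3),
}
NAMES = {perm: name for name, perm in MOTIONS.items()}

def compose(a, b):
    """Return a ∘ b: apply b first, then a."""
    return tuple(a[b[i] - 1] for i in range(len(b)))

def product_name(x, y):
    """Name of the motion x ∘ y."""
    return NAMES[compose(MOTIONS[x], MOTIONS[y])]

# The full multiplication table, keyed by (row, column).
table = {(x, y): product_name(x, y) for x, y in product(MOTIONS, repeat=2)}

print(table[("t2", "s2")])  # the product worked out above
```

Every lookup in NAMES succeeds, which is closure; comparing table[("t1", "s2")] with table[("s2", "t1")] shows that the order of the factors matters.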

The associative law can be verified by brute force with the aid of the table;
but surely no one will actually undertake to check all possible cases. Instead
we note that because we are dealing with mappings the associative law for the
operation of composition is automatic; in fact, for σ, τ, ρ ∈ G both σ ∘ (τ ∘ ρ)
and (σ ∘ τ) ∘ ρ are given by the rule: first apply ρ, then apply τ, and then σ;
so σ ∘ (τ ∘ ρ) = (σ ∘ τ) ∘ ρ. (This fact has already been observed in 2-5-12,
Problem 5, and it will be discussed in more detail in 4-1-11.)

Clearly, e is a left identity, and from the table every element of G has a
left inverse (this is so because e appears in every column of the table). Of
course, the existence of a left inverse is also obvious from the geometry: any
rigid motion can be followed by another one that returns all vertices to the
original position. Hence, G is a group, known sometimes as the "group of
the equilateral triangle." It is not commutative; for example, τ₁σ₂ = τ₃
while σ₂τ₁ = τ₂.

4-1-10. Example. Consider the set of all rigid motions of the square as
shown in the accompanying figure.

[Figure: square with vertices labeled 1, 2, 3, 4.]

Any rigid motion is completely determined by what happens to the vertices at
1 and 2. Since the vertex at 1 can go to any one of four places and the vertex
at 2 can then finish in either of two places, there are a total of eight such
rigid motions. They may be denoted by

\[ e = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}, \quad
   \sigma_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}, \]

\[ \sigma_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \quad
   \sigma_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix}, \]

\[ \tau_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}, \quad
   \tau_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix}, \]

\[ \rho_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 3 & 2 \end{pmatrix}, \quad
   \rho_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix}. \]

It is tedious but straightforward to derive the multiplication table:

     ∘  |  e    σ₁   σ₂   σ₃   τ₁   τ₂   ρ₁   ρ₂
    ----+----------------------------------------
     e  |  e    σ₁   σ₂   σ₃   τ₁   τ₂   ρ₁   ρ₂
     σ₁ |  σ₁   σ₂   σ₃   e    ρ₁   ρ₂   τ₂   τ₁
     σ₂ |  σ₂   σ₃   e    σ₁   τ₂   τ₁   ρ₂   ρ₁
     σ₃ |  σ₃   e    σ₁   σ₂   ρ₂   ρ₁   τ₁   τ₂
     τ₁ |  τ₁   ρ₂   τ₂   ρ₁   e    σ₂   σ₃   σ₁
     τ₂ |  τ₂   ρ₁   τ₁   ρ₂   σ₂   e    σ₁   σ₃
     ρ₁ |  ρ₁   τ₁   ρ₂   τ₂   σ₁   σ₃   e    σ₂
     ρ₂ |  ρ₂   τ₂   ρ₁   τ₁   σ₃   σ₁   σ₂   e

By using the same techniques as in 4-1-9, one sees that these eight rigid motions
form a nonabelian group. It is known as the group of the square and also as the
octic group.

What about the symmetries of the square? Naturally, since any symmetry
is a rigid motion there are at most eight of them. Now consider the
accompanying picture, where the horizontal and vertical lines through 0 pass
through the midpoints of the opposite sides.

[Figure: the square with center 0, vertical axis ℓ and horizontal axis ℓ′ through the midpoints of opposite sides.]

Besides the identity symmetry e, we have the following additional symmetries:

(i) Rotation by 90°, counterclockwise, about the axis through 0
perpendicular to the plane of the square; this is σ₁.

(ii) Rotation by 180°, counterclockwise, about the same axis; this is σ₂.

(iii) Rotation by 270°, counterclockwise (or by 90°, clockwise), about the
same axis; this is σ₃.

(iv) Rotation by 180° (that is, flipping the square over) with respect to
the vertical axis ℓ; this is τ₁.

(v) Rotation by 180° (that is, flipping the square over) with respect to the
horizontal axis ℓ′; this is τ₂.

(vi) Rotation by 180° (that is, flipping) about the diagonal passing through
1 and 3; this is ρ₁.

(vii) Rotation by 180° (that is, flipping) about the diagonal passing through
2 and 4; this is ρ₂.

Consequently, there are eight symmetries and they constitute the same
group as the rigid motions of the square.
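
For readers who would rather not derive the 8 × 8 table by hand, here is a minimal sketch in Python that checks closure, the existence of inverses, and the failure of commutativity for the eight permutations listed in this example. The names s1, s2, s3, t1, t2, p1, p2 stand for σ₁, σ₂, σ₃, τ₁, τ₂, ρ₁, ρ₂ and, like the helper compose, are illustrative choices.

```python
# Bottom rows of the two-row symbols for the eight rigid motions of the square.
SQUARE = {
    "e":  (1, 2, 3, 4),
    "s1": (2, 3, 4, 1),
    "s2": (3, 4, 1, 2),
    "s3": (4, 1, 2, 3),
    "t1": (4, 3, 2, 1),
    "t2": (2, 1, 4, 3),
    "p1": (1, 4, 3, 2),
    "p2": (3, 2, 1, 4),
}

def compose(a, b):
    """Return a ∘ b: apply b first, then a."""
    return tuple(a[b[i] - 1] for i in range(len(b)))

elements = set(SQUARE.values())
identity = SQUARE["e"]

# Closure: every product of two listed motions is again a listed motion.
closed = all(compose(a, b) in elements for a in elements for b in elements)

# Inverses: for each a there is some b with b ∘ a = e.
has_inverses = all(
    any(compose(b, a) == identity for b in elements) for a in elements
)

# Commutativity fails for at least one pair.
abelian = all(compose(a, b) == compose(b, a) for a in elements for b in elements)

print(closed, has_inverses, abelian)
```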

4-1-11. Example. Consider an arbitrary set X = {x, y, z, ...} and let S_X
denote the set of all permutations of X. Intuitively, a permutation of X is a
rearrangement or shuffling of the elements of X; formally, the appropriate
way to formulate this notion is to define a permutation as a mapping of X → X
which is 1-1 and onto. Thus, we may write

\[ S_X = \{ \sigma \in \mathrm{Map}(X, X) \mid \sigma \text{ is 1-1 and onto} \}. \]

Let us define the product of two permutations to be their composition.
In other words, for σ, τ ∈ S_X we take their product to be σ ∘ τ. According to
the definition of composition (see 2-5-3), σ ∘ τ: X → X is defined by

\[ (\sigma \circ \tau)(x) = \sigma(\tau(x)), \quad \text{for all } x \in X. \]

Furthermore, as was shown in 2-5-3, σ ∘ τ is surjective because both σ and τ
are surjective, and σ ∘ τ is injective because both σ and τ are injective.
Therefore, σ ∘ τ is a permutation of X; that is, σ ∘ τ ∈ S_X, and S_X is closed
under the operation of composition, ∘.

It is now easy to see that {S_X, ∘} is, in fact, a group. The associative law
holds: for σ, τ, ρ ∈ S_X we have σ ∘ (τ ∘ ρ) = (σ ∘ τ) ∘ ρ because for every
x ∈ X, [σ ∘ (τ ∘ ρ)](x) and [(σ ∘ τ) ∘ ρ](x) both equal σ(τ(ρ(x))). The identity
map 1: X → X defined by 1(x) = x for every x ∈ X is an identity for S_X,
because surely 1 ∘ σ = σ ∘ 1 = σ for every σ ∈ S_X. Finally, given σ ∈ S_X,
because σ: X → X is 1-1 and onto there exists (according to 2-5-4) a mapping
τ: X → X, called the "inverse map" of σ, such that σ ∘ τ = 1 and τ ∘ σ = 1, and
in addition τ is 1-1 and onto. Thus, τ is an element of S_X and it is indeed an
inverse of σ under the operation ∘. We have verified that {S_X, ∘} is a group,
called the group of permutations (or symmetries) of the set X. Of course,
we write σ⁻¹ (instead of τ) for the inverse of σ; after all, this is the way we
normally write inverses in a multiplicative group.

As a special case of the preceding, suppose the set X is finite; say, X has n
elements. Since the structure of the group S_X is not affected by the nature of
elements of X, there is nothing lost in taking

\[ X = \{1, 2, \ldots, n\}. \]

Then we write Sₙ instead of S_X and refer to Sₙ as the symmetric group on n
letters. An element σ ∈ Sₙ is a permutation of the n integers 1, 2, ..., n and it
is customary to describe σ completely (as was done in 4-1-9 and 4-1-10) by
writing each of the n integers with its image under σ below it. For example, if
σ ∈ S₅ is the mapping for which:

1 → 4, 2 → 1, 3 → 3, 4 → 5, 5 → 2

then we write

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 3 & 5 & 2 \end{pmatrix}. \]

Obviously, there would be no harm in permuting the columns and writing
σ as

\[ \begin{pmatrix} 3 & 5 & 2 & 1 & 4 \\ 3 & 2 & 1 & 4 & 5 \end{pmatrix}
   \quad \text{or} \quad
   \begin{pmatrix} 2 & 5 & 3 & 1 & 4 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix}. \]



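The identity element of S₅ is

\[ 1 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} \]

and the inverse of σ is easily seen to be

\[ \sigma^{-1}
   = \begin{pmatrix} 4 & 1 & 3 & 5 & 2 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix}
   = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 5 & 3 & 1 & 4 \end{pmatrix} \]

(that is, one merely interchanges the two rows of σ). That this is the inverse
depends on the fact that we already know how to multiply elements of S₅; for
example, if

\[ \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 5 & 4 & 2 & 1 \end{pmatrix}, \]

then

\[ \sigma \circ \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 2 & 5 & 1 & 4 \end{pmatrix}, \quad
   \tau \circ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 3 & 4 & 1 & 5 \end{pmatrix}, \]

\[ \sigma^2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \end{pmatrix}, \quad
   \tau^2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 2 & 5 & 3 \end{pmatrix}, \]

and, in particular,

\[ \sigma \circ \sigma^{-1} = \sigma^{-1} \circ \sigma
   = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} = 1. \]
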
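
The products and the inverse in S₅ discussed here can be checked with a short sketch in Python. It assumes a permutation is stored as the bottom row of its two-row symbol; the helpers compose and inverse are illustrative names, not notation from the text.

```python
def compose(a, b):
    """Return a ∘ b: apply b first, then a."""
    return tuple(a[b[i] - 1] for i in range(len(b)))

def inverse(a):
    """Interchange the two rows of a and re-sort the columns by the top row."""
    inv = [0] * len(a)
    for i, image in enumerate(a, start=1):
        inv[image - 1] = i
    return tuple(inv)

sigma = (4, 1, 3, 5, 2)   # 1→4, 2→1, 3→3, 4→5, 5→2
tau = (3, 5, 4, 2, 1)
one = (1, 2, 3, 4, 5)

print(compose(sigma, tau))
print(compose(tau, sigma))
print(inverse(sigma))
print(compose(sigma, inverse(sigma)) == one)
```

Note that σ ∘ τ and τ ∘ σ differ, so S₅ is not abelian.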


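More generally, for arbitrary n, σ ∈ Sₙ is written in the form

\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\
   \sigma 1 & \sigma 2 & \sigma 3 & \cdots & \sigma n \end{pmatrix}. \]

The identity element of the group Sₙ is

\[ 1 = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\ 1 & 2 & 3 & \cdots & n \end{pmatrix}. \]

If, in addition, τ ∈ Sₙ, then

\[ \tau = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\
   \tau 1 & \tau 2 & \tau 3 & \cdots & \tau n \end{pmatrix} \]

and we have

\[ \sigma \circ \tau = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\
   \sigma\tau 1 & \sigma\tau 2 & \sigma\tau 3 & \cdots & \sigma\tau n \end{pmatrix}, \quad
   \tau \circ \sigma = \begin{pmatrix} 1 & 2 & 3 & \cdots & n \\
   \tau\sigma 1 & \tau\sigma 2 & \tau\sigma 3 & \cdots & \tau\sigma n \end{pmatrix}. \]

As for the inverse, clearly

\[ \sigma^{-1} = \begin{pmatrix} \sigma 1 & \sigma 2 & \sigma 3 & \cdots & \sigma n \\
   1 & 2 & 3 & \cdots & n \end{pmatrix}. \]

Note that in the expression for σ⁻¹ the top row consists of all the elements
1, 2, ..., n except that they are not in the usual order; but the expression
obviously denotes a mapping which is 1-1 and onto.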

The number of elements in the group Sₙ is clearly n!; in fact, to determine
a permutation σ there are n choices for σ1, and then n − 1 choices for σ2, and
so on, so there are indeed n! choices for σ ∈ Sₙ. In particular, S₃ has 3! = 6
elements. Because the rigid motions of the equilateral triangle (see 4-1-9) can
be viewed as permutations of the set {1, 2, 3} (that is, as elements of S₃) and
there are six of them, it follows (because the operation for both rigid motions
and elements of S₃ is the same: namely, composition of mappings) that the
group of rigid motions of the equilateral triangle is S₃.
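
The count n! can be confirmed for small n by listing the permutations outright. This is a minimal sketch using Python's standard itertools.permutations; the function name symmetric_group and the range of n tried are illustrative assumptions.

```python
from itertools import permutations
from math import factorial

def symmetric_group(n):
    """All permutations of {1, ..., n}, each given as the bottom row of its symbol."""
    return list(permutations(range(1, n + 1)))

# Compare the number of listed permutations with n! for n = 1, ..., 6.
sizes = {n: len(symmetric_group(n)) for n in range(1, 7)}
for n, size in sizes.items():
    print(n, size, factorial(n))
```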


4-1-12 / PROBLEMS 

1. In any group G, show

(i) ab = b ⇒ a = e.

(ii) a² = a ⇒ a = e.

(iii) A left inverse of a given element is also a right inverse, and conversely.

2. Let elements a, b, c in the group G be given. Then the equation

axbcx = cabx

has a unique solution for x.

3. For elements a, b in the abelian group G, prove

(ab)ⁿ = aⁿbⁿ for all n ∈ Z.

Of course, the group G does not have to be abelian; it is only necessary
that ab = ba.

4. If a² = e for all a in the group G, then G is abelian.

5. Suppose G is a group; show it is abelian ⇔ (ab)² = a²b² for all a, b ∈ G.

6. In each case, decide which of the following properties of a group are
satisfied: closure, associative law, identity, inverses, commutative law,
cancellation laws. In particular, which of these are groups? semigroups?

(i) {Z, ∘} where a ∘ b = 0,

(ii) {Z, ∘} where a ∘ b = a − b,

(iii) {Z, ∘} where a ∘ b = a + b + 1,

(iv) {Z, ∘} where a ∘ b = a,

(v) {Z, ∘} where a ∘ b = a + b + ab,

(vi) the set of all even integers under addition,

(vii) the set of all odd integers under addition,

(viii) the set of all even integers under multiplication,

(ix) the set of all odd integers under multiplication,

(x) the set of all positive rationals under addition,

(xi) the set of all positive rationals under multiplication,

(xii) the set of all rationals a satisfying 0 < a < 1 under multiplication,

(xiii) {G, ∘} where G is the set of all rationals with the single element
−1 excluded, and the operation is a ∘ b = a + b + ab,

(xiv) {Q, ∘} where a ∘ b = max{a, b},

(xv) {R, ∘} where a ∘ b = (ab)/2,

(xvi) {R, ∘} where a ∘ b = |a| + |b|,

(xvii) {R, ∘} where a ∘ b = [a + b] (greatest integer function),

(xviii) {R, ∘} where a ∘ b = √(a² + b²),

(xix) all irrational real numbers under addition,

(xx) all irrational real numbers under multiplication,

(xxi) {C, ∘} where α ∘ β = |α| |β|,

(xxii) the elements 1, 5, 7, 11 of Z₁₂ under multiplication,

(xxiii) the elements 0, 2, 5, 7 of Z₁₀ under addition,

(xxiv) the elements 1, 3, 9 of Z₁₀ under multiplication,

(xxv) the elements 3, 6, 9 of Z₁₂ under addition,

(xxvi) G = {0, 1, 2, ..., m − 1} and the operation ∘ is given by

a ∘ b = a + b, if a + b < m,

a ∘ b = r, if a + b ≥ m and r = (a + b) − m,

(xxvii) for a fixed prime p, G = {n/pʳ | n ∈ Z, r ≥ 0} and the operation
is addition,

(xxviii) G = {a + b√2 | a, b ∈ Q, not both 0} under multiplication,

(xxix) G = {a + b√2 | a, b ∈ Z} under addition,

(xxx) G = (a + b e Q, not both 0} under multiplication, 

(xxxi) for an arbitrary nonempty set X, G = Map(X, X) and the operation
is composition of mappings,

(xxxii) G = {(a, b) ∈ R × R | not both 0} and the operation ∘ is given by
(a, b) ∘ (c, d) = (ac − bd, ad + bc),

(xxxiii) the nonzero elements of an integral domain D under multiplication,

(xxxiv) Take R as the underlying set of a ring {R, +, ·} with operation ∘
given by a ∘ b = a + b + ab.

7. (i) Do the four matrices

*-(i ?)•_ ‘-('i ?)• »-G -?)• ‘-('i -5) 

in Ji{ Z, 2) constitute a group under multiplication ? Make a table. 
(ii) Do the same thing for the four matrices 

-(j ;)• «-(? -i)- «-(-i -?) 

8. Derive the multiplication table for the rigid motions of the square, as 
given in 4-1-10. Verify that these rigid motions form a group. Compare 
S 4 (the symmetric group on four letters) and the group of the square. 

9. How is the fact, 4-1-5, that a finite semigroup with cancellation is a group 
related to the fact, 3-2-11, that a finite integral domain is a field? How are 
the proofs related ? 

10. Why does the proof of 4-1-5 not go through when G is infinite? 

11. By defining an operation appropriately, can you make the following sets
into a group? If so make a table for each case where you have a group.

(i) {0, 1}, (ii) {−1, +1}, (iii) {a, b},

(iv) {0, 1, 2}, (v) {0, +1, −1}, (vi) {a, b, c}.

12. On the set G = {e, a, b, c} define an operation by the table

     ·  |  e   a   b   c
    ----+----------------
     e  |  e   a   b   c
     a  |  a   e   c   b
     b  |  b   b   a   e
     c  |  c   c   e   a

Is this a group?

13. Show that if G = {e, a, b} is a 3-element group with identity e, then its
multiplication table must take the form:

     ·  |  e   a   b
    ----+------------
     e  |  e   a   b
     a  |  a   b   e
     b  |  b   e   a



14. Show that if G = {e, a, b, c} is a 4-element group with identity e, then
(except for possible renaming of the elements) its multiplication table
must take one of the two forms:

     ·  |  e   a   b   c            ·  |  e   a   b   c
    ----+----------------          ----+----------------
     e  |  e   a   b   c            e  |  e   a   b   c
     a  |  a   b   c   e            a  |  a   e   c   b
     b  |  b   c   e   a            b  |  b   c   e   a
     c  |  c   e   a   b            c  |  c   b   a   e

15. For the following elements of S₇


\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 3 & 1 & 6 & 7 & 2 & 4 & 5 \end{pmatrix}, \quad
   \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 6 & 5 & 2 & 1 & 4 & 7 & 3 \end{pmatrix}, \]

\[ \rho = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 4 & 6 & 3 & 7 & 5 & 2 & 1 \end{pmatrix} \]

compute

(i) σ⁷, (ii) τ⁷, (iii) ρ⁷,

(iv) σ⁻¹, (v) τ⁻¹, (vi) ρ⁻¹,

(vii) σ ∘ τ, (viii) τ ∘ σ, (ix) ρ ∘ σ ∘ τ,

(x) τ ∘ σ ∘ τ⁻¹, (xi) ρ ∘ σ ∘ ρ⁻¹, (xii) ρ² ∘ σ ∘ ρ⁻²,

(xiii) σ⁻⁷, (xiv) τ⁻⁷, (xv) ρ⁻⁷,

(xvi) σ³ ∘ ρ² ∘ τ⁻², (xvii) τ² ∘ ρ² ∘ τ² ∘ ρ², (xviii) σ ∘ τ ∘ σ⁻¹ ∘ τ⁻¹.


16. Show that the rigid motions of the rectangle (which is not a square) 
(such as the one shown in the accompanying diagram) form a group. 


[Figure: rectangle with vertices labeled 1 (upper left), 4 (upper right), 2 (lower left), 3 (lower right).]


Make a table. How is this group (which is known as the four group or 
as Klein’s four group) related to the group of the square and to S 4 ? What 
about the symmetries of this rectangle ? 

17. The regular pentagon (as the one shown in the accompanying figure) has
10 rigid motions.

[Figure: regular pentagon with labeled vertices.]

List them, and show that they form a group. If you have plenty of time
and energy make a table. Compare this group with S₅. What about the
symmetries of the regular pentagon?

18. Treat the 12 rigid motions of the regular hexagon in the manner of
Problem 17.



19. Give an example of two elements a, b in a group, with (ab)⁻¹ ≠ a⁻¹b⁻¹.

20. Prove: If a group has an even number of elements, then there exists an
element not equal to e which is its own inverse.

21. Prove: If G is a group, then:

(i) {a⁻¹ | a ∈ G} = G; that is, the set of all inverses of elements of G is G
itself.

(ii) The mapping φ: G → G which maps a → a⁻¹ is 1-1 and onto.
Moreover, φ is a homomorphism ⇔ G is abelian.

22. If X is an arbitrary set, we know from 2-2-3 that 𝒮(X), the set of all
subsets of X, can be made into a commutative ring with unity. What is the
group of units of this ring?

23. Consider {𝒮(X), ∘} where ∘ is given by

(i) A ∘ B = A ∪ B,

(ii) A ∘ B = A ∩ B,

(iii) A ∘ B = A − B = {x ∈ A | x ∉ B},

(iv) A ∘ B = A + B = A ∪ B − A ∩ B.

In which cases do we have a group? a semigroup?

24. If X is a set with more than two elements, show that S_X, the group of
permutations, is not abelian.

25. Consider the rings:

(i) Z₅, (ii) Z ⊕ Z, (iii) Z₃ ⊕ Z₃,

(iv) Z₄ ⊕ Z₄, (v) Z₃ ⊕ Z₄, (vi) Z₃ ⊕ Z.

In each case, find the group of units and make a multiplication table for
it.

26. If G₁ and G₂ are multiplicative groups, the product set G₁ × G₂ =
{(a₁, a₂) | a₁ ∈ G₁, a₂ ∈ G₂} can be made into a group by defining
multiplication componentwise; that is,

\[ (a_1, a_2)(b_1, b_2) = (a_1 b_1, a_2 b_2). \]

The result is called the direct product G₁ × G₂.

Generalize the direct product to any number of groups. When is the 
resulting group commutative ? 

27. The reals under addition, {R, +}, are a group, and so are the positive
reals under multiplication, {R_{>0}, ·}. Are these groups related?

28. Show, by example, that if {G, ·} is a semigroup in which (i) there is a
left identity e, and (ii) every element has a right inverse, then {G, ·} need
not be a group.


29. Prove that the following sets of complex numbers are groups under
multiplication:

(i) W₄ = {1, i, −1, −i}; these are the four 4th roots of unity.

(ii) W₃ = {1, (−1 + √3 i)/2, (−1 − √3 i)/2}; these are the three cube roots
of unity.

(iii) W₆ = {±1, (1 ± √3 i)/2, (−1 ± √3 i)/2}; these are the six 6th roots
of unity.

(iv) W₈ = {±1, ±i, (±1 ± i)/√2}; these are the eight 8th roots of unity.

(v) For any n ≥ 1, Wₙ = {z ∈ C | zⁿ = 1}; this is the set of all nth roots
of unity.


30. (i) In ℳ(R, 2), the ring of all 2 × 2 matrices with entries from R, exhibit
a concrete matrix of form

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix}
   \quad \text{with } a \neq 0,\ b \neq 0,\ c \neq 0,\ d \neq 0 \]

which has an inverse. Find its inverse. Can you determine all matrices in
ℳ(R, 2) which have an inverse? In other words, can you find the group
of units of the ring ℳ(R, 2)?

(ii) Do the same thing for the ring ℳ(Z, 2).

4-2. Subgroups and Cosets

4-2-1. Definition. A nonempty subset H of the group G is said to be a 
subgroup of G when, with respect to the operation induced from G, H becomes 
a group in its own right. In other words, H is a subgroup of G = {G, •} when 
{H, •} is itself a group. 

The definition of a subgroup is quite natural; it is essentially parallel to
the definition of subring of a ring as given in 2-2-6. More generally, for any
kind of algebraic system, the notion of a "subsystem" would be defined in the
same fashion.

To decide if a given subset of a group is indeed a subgroup we have the 
criteria given by our next result. These criteria are analogous to those for a 
subring, as discussed in 2-2-7. 

4-2-2. Proposition. Suppose H is a nonempty subset of the group G =
{G, ·}; then the following are equivalent:

(i) H is a subgroup of G.

(ii) a, b ∈ H ⇒ ab ∈ H and a⁻¹ ∈ H; that is, H is closed under both
multiplication and taking of inverses.

(iii) a, b ∈ H ⇒ ab⁻¹ ∈ H.


Proof: (i) ⇒ (ii): Suppose H is a subgroup of G; then, obviously, a, b ∈ H
⇒ ab ∈ H. However, because a⁻¹ denotes the inverse of a as an element of the
group G, it is not immediate that a ∈ H ⇒ a⁻¹ ∈ H.

To see this, it surely suffices to prove that for any subgroup H of G we
have:

(1) Its identity (denote it temporarily by e′) is the same as the identity e of
G.

(2) For a ∈ H, its inverse in the group H (denote it temporarily by a′) is
the same as its inverse a⁻¹ in the group G.

Now, from

\[ e'e' = e' = ee' \]

we obtain by cancellation in G, e′ = e, so (1) holds. Then

\[ aa' = e' = e = aa^{-1} \]

so by cancellation in G, a′ = a⁻¹, so (2) holds. This completes the proof that
(i) ⇒ (ii).

(ii) ⇒ (iii): By hypothesis, H is closed under multiplication and taking of
inverses. Therefore,

a, b ∈ H ⇒ a, b⁻¹ ∈ H ⇒ ab⁻¹ ∈ H

so (iii) holds.

(iii) ⇒ (i): For any two elements a, b ∈ H we know, by hypothesis, that
ab⁻¹ ∈ H. Apply this first to the special case where the elements a and b of H
are equal; the conclusion is aa⁻¹ ∈ H, that is, the identity e belongs to H, or
to put it another way, H has an identity. Next apply it to e ∈ H and any a ∈ H;
the conclusion is ea⁻¹ ∈ H, that is, the inverse of an element of H belongs
to H, or to put it another way, every element of H has an inverse in H.
Furthermore, for a, b ∈ H we now have a, b⁻¹ ∈ H and consequently
a(b⁻¹)⁻¹ ∈ H; thus ab ∈ H, and we conclude that H is closed under
multiplication. Since the associative law is automatic for elements of H, we have
proved that H is a group, hence a subgroup of G. ∎
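
Criterion (iii) is convenient for machine checking because it is a single condition. The sketch below applies it to subsets of the group of units of Zₙ; this choice of ambient group, the function name, and the sample sets are illustrative assumptions. It relies on Python's built-in pow(b, -1, n) (available from Python 3.8) for the inverse modulo n, and it assumes every element of the subset is a unit.

```python
def is_subgroup_of_units(subset, n):
    """One-step test (iii): H is nonempty and a * b^{-1} lies in H for all a, b in H.

    Assumes every element of `subset` is a unit modulo n; otherwise
    pow(b, -1, n) raises ValueError.
    """
    h = {x % n for x in subset}
    if not h:
        return False
    return all((a * pow(b, -1, n)) % n in h for a in h for b in h)

print(is_subgroup_of_units({1, 5, 8, 12}, 13))  # closed under a * b^{-1}
print(is_subgroup_of_units({1, 5}, 13))         # 1 * 5^{-1} = 8 is missing
print(is_subgroup_of_units({1, 5}, 12))         # 5 is its own inverse mod 12
```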

The preceding result is stated for a multiplicative group. When the group
operation is addition, +, the criteria that H be a subgroup take the equivalent
forms

a, b ∈ H ⇒ a + b ∈ H and −a ∈ H,

or

a, b ∈ H ⇒ a − b ∈ H; that is, H is closed under subtraction.

In a rather special situation (which will occur frequently in our discussion) 
it is possible to give an even simpler criterion for a subgroup—namely, 

4-2-3. Proposition. If the finite subset H of the group G = {G, ·} is closed
under the operation, then H is a subgroup of G.

Proof: Since H is closed under the operation, H = {H, ·} is clearly a
semigroup. Moreover, the cancellation laws hold in G, so they surely hold in
H. Thus, H is a finite (in particular, finite implies nonempty) semigroup that
satisfies both cancellation laws. Hence, according to 4-1-5, {H, ·} is a group;
that is, H is a subgroup of G.

There is also something to be gained by giving a more pedestrian proof
of this result. Since H is closed under the operation, it suffices to show
a ∈ H ⇒ a⁻¹ ∈ H. We do this in a concrete fashion which indicates how one
can actually compute a⁻¹.

Suppose a ∈ H. If a = e, then obviously a⁻¹ ∈ H, and we are finished. So
suppose a ≠ e. Now, consider the sequence of powers of a

a, a², a³, a⁴, ....     (*)

Because H is closed under multiplication all these elements belong to H. But
H has only a finite number of elements, so not all these powers of a can be
distinct. Thus, there exist distinct positive integers i and j (say i < j) for which
aⁱ = aʲ. (We have just used an obvious but extremely important principle
known as the pigeon-hole principle. It may be stated formally as follows: If
more than n objects are placed in, or distributed among, n boxes (or pigeon
holes) then there exists a box that contains two, or more, objects.) Then,
multiplying, in G, by a⁻ⁱ we obtain

aʲ⁻ⁱ = e, j − i ≥ 1.

Consequently, e appears in the sequence (*) of powers of a. Let m be the
smallest positive exponent for which

aᵐ = e.

Since a ≠ e, we know that m > 1; thus m − 1 ≥ 1 and aᵐ⁻¹ ∈ H. From

a · aᵐ⁻¹ = aᵐ = e

we conclude that a⁻¹ = aᵐ⁻¹ ∈ H, so H is a subgroup.

(In particular, one way to locate the inverse of a is to compute the sequence
of powers of a; upon arriving at the first power of a which equals e, one takes
the preceding power of a.) ∎
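
The recipe in the closing remark translates directly into a loop. Here is a minimal sketch in Python, assuming the group is the group of units of Zₙ under multiplication; the function name inverse_by_powers and the iteration guard are illustrative choices, not part of the proof.

```python
def inverse_by_powers(a, n):
    """Inverse of the unit a modulo n, found as the power of a that
    precedes the first power equal to 1."""
    a %= n
    previous, power = 1, a          # a^0 and a^1
    for _ in range(n):              # a unit reaches 1 within n steps
        if power == 1:
            return previous
        previous, power = power, (power * a) % n
    raise ValueError(f"{a} is not a unit modulo {n}")

# Powers of 5 modulo 13 run 5, 12, 8, 1, so the inverse of 5 is 8.
print(inverse_by_powers(5, 13))
```

For large groups this is far slower than the Euclidean algorithm, but it mirrors the proof step by step.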

If G is any group, then G itself is a subgroup and so is (e) (the set, or
subgroup, consisting of the identity alone). These subgroups provide no
additional information about the group, and they are said to be trivial. We shall
now turn to a number of examples of nontrivial subgroups.

4-2-4. Example. Consider {Z, +}, the group of integers under addition.
The set 2Z of all even integers is obviously a subgroup. More generally, for
any m ≥ 0, mZ is a subgroup of the additive group of integers, since it is
closed under subtraction; that is, if ma₁, ma₂ ∈ mZ, then ma₁ − ma₂ =
m(a₁ − a₂) ∈ mZ.

Furthermore, we can show that any subgroup of {Z, +} is of form mZ. To
do this, consider an arbitrary subgroup H. If H = (0), then for m = 0 we have
H = mZ; so suppose H ≠ (0). There exists an element a ≠ 0 in H. If a is
negative, then its inverse −a is positive and belongs to H. Thus H contains a
positive integer. Let m be the smallest positive integer belonging to H.
Because H is a subgroup, 2m = m + m belongs to H, and so do 3m = 2m + m,
4m = 3m + m, ...; of course, the inverses −m, −2m, −3m, −4m, ... also
belong to H. In particular,

mZ ⊂ H, m ≥ 1.

If m = 1, then mZ = Z, and we have H = Z = 1Z. Suppose, therefore, that
m > 1, and let a be an arbitrary element of H. Applying the division algorithm
to a and m, there exist integers q and r for which

a = qm + r, 0 ≤ r < m.

Now, qm = mq ∈ mZ ⊂ H, so because H is a group

r = a − qm ∈ H.

Because m is the smallest positive integer belonging to H, it follows that r = 0.
Thus, a = qm, so any element a of H belongs to mZ and we have H ⊂ mZ.
The conclusion is

H = mZ.

We have proved: The set of all subgroups of {Z, +} is given by {mZ | m ≥ 0}.

Incidentally, it may be noted that the subgroups mZ are all distinct; in
other words, for m₁, m₂, both greater than or equal to 0, we have

m₁Z = m₂Z ⇔ m₁ = m₂.

The details are left to the reader.
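
The role of the smallest positive element can be seen numerically. The sketch below assumes H is the subgroup generated by a few chosen integers, so that its elements are the integer combinations of those generators (a standard fact, not proved in this example), and searches a bounded window of coefficients for the smallest positive one. The function name, the generators, and the bound are illustrative assumptions; a bound that is too small could miss the true minimum.

```python
from functools import reduce
from itertools import product
from math import gcd

def smallest_positive_element(generators, bound=10):
    """Smallest positive value of sum(c_i * g_i) with each |c_i| <= bound."""
    coeffs = range(-bound, bound + 1)
    values = (
        sum(c * g for c, g in zip(combo, generators))
        for combo in product(coeffs, repeat=len(generators))
    )
    return min(v for v in values if v > 0)

gens = (12, 18, 30)
m = smallest_positive_element(gens)

# Every generator is a multiple of m, so H sits inside mZ, as in the proof.
print(m, all(g % m == 0 for g in gens))
```

Here m agrees with the greatest common divisor of the generators, which is what H = mZ predicts.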

4-2-5. Example. Consider {Z₁₂, +}, the additive group of the ring
Z₁₂. Is the set S = {0, 3, 6, 9} a subgroup? According to 4-2-3, because S is
finite, it suffices to check if S is closed under addition. This is a
straightforward matter; the work may be organized into an addition table for the
elements of S:

     +  |  0   3   6   9
    ----+----------------
     0  |  0   3   6   9
     3  |  3   6   9   0
     6  |  6   9   0   3
     9  |  9   0   3   6

All the entries in the table are from S; in other words, S is closed under
addition, and S is indeed a subgroup.

In the same way, the sets {0, 6}, {0, 4, 8}, and {0, 2, 4, 6, 8, 10} are
subgroups of {Z₁₂, +}.

On the other hand, the set T = {0, 3, 5, 8, 11} is not a subgroup because
it is not closed under addition. In particular, not all entries of the addition
table for T belong to T; for example, 5 + 8 = 1, which is not in T.

As a matter of fact, it is not hard to see that except for the trivial subgroups
(0) and Z₁₂, we have already listed all the subgroups of {Z₁₂, +}. (One way
to do this is to imitate the method used in 4-2-4 to determine all subgroups of
the additive group of integers.) More generally, in the next section we shall
learn, among other things, how to find all subgroups of {Zₘ, +} for any
m > 1.

We may also consider {Z₁₂*, ·}, the group of units of the ring Z₁₂. This
is a group whose underlying set consists of the four elements 1, 5, 7, 11 of
Z₁₂. It is clear that {1, 5} is a subgroup, as are {1, 7} and {1, 11}; and, except
for the trivial subgroups (1) and Z₁₂*, there are no others.

In the same spirit, consider {Z₁₃, +}, the additive group of the ring Z₁₃.
Is the subset S = {0, 3, 6, 9, 12} a subgroup? Because S is finite, only closure
needs to be checked. Since 6 + 9 = 2 in Z₁₃, we see that closure fails for S,
so it is not a subgroup. As a matter of fact, the group {Z₁₃, +} has no
subgroups except the trivial ones. The reader may convince himself of this fact
directly, but should he prefer to wait then he will see it fall out as a
consequence of a general result later in this section (see 4-2-17).
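
Since 4-2-3 reduces the subgroup test for a finite subset to closure, the checks in this example are easy to automate. The following is a minimal sketch in Python for subsets of {Zₙ, +}; the function name is an illustrative choice.

```python
def is_closed_under_addition(subset, n):
    """True when the nonempty subset of Z_n is closed under addition mod n."""
    s = {x % n for x in subset}
    return bool(s) and all((a + b) % n in s for a in s for b in s)

# Subsets of Z_12 discussed in the example.
print(is_closed_under_addition({0, 3, 6, 9}, 12))
print(is_closed_under_addition({0, 3, 5, 8, 11}, 12))   # 5 + 8 = 1 is missing

# The subset of Z_13: 6 + 9 = 2 is missing.
print(is_closed_under_addition({0, 3, 6, 9, 12}, 13))
```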

We can also study { Z* 3 , •} the group of units of the ring Z 13 . This is a 
group whose underlying set may be taken as the twelve elements 1, 2, 3, 4,..., 
10, 11, 12 of Z 13 . Is the subset {1, 5, 8, 12} a subgroup? Constructing the 
multiplication table for this set, we have 


      •  |   1    5    8   12
    -----+--------------------
      1  |   1    5    8   12
      5  |   5   12    1    8
      8  |   8    1   12    5
     12  |  12    8    5    1


so this finite set is closed under multiplication and is, therefore, a subgroup of
$\{\mathbf{Z}^*_{13}, \cdot\}$.

It is straightforward to check that among the nontrivial subgroups of
$\{\mathbf{Z}^*_{13}, \cdot\}$ we also have

{1, 12}, {1, 3, 9}, {1, 3, 4, 9, 10, 12}.

In fact, it is not hard to see that we have already listed all the nontrivial
subgroups of $\{\mathbf{Z}^*_{13}, \cdot\}$. This fact will also appear as an
immediate consequence of general considerations in the next section. More
generally, we shall study the structure of the group
$\{\mathbf{Z}^*_m, \cdot\}$, for any m > 1, in Section 4-6.
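The claims above about $\mathbf{Z}_{12}$ and $\mathbf{Z}^*_{13}$ can be checked by brute force. The following is a minimal sketch, not a method taken from the text: it tests every subset containing the identity for closure, relying on the fact (4-2-3) that a nonempty finite closed subset is a subgroup. The function name `closed_subsets` and the variable names are illustrative assumptions.

```python
from itertools import combinations
from math import gcd


def closed_subsets(elements, op, identity):
    """Return every subset that contains the identity and is closed under op.

    For a finite subset, closure alone guarantees a subgroup (see 4-2-3).
    """
    others = [x for x in elements if x != identity]
    found = []
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            subset = {identity, *extra}
            if all(op(a, b) in subset for a in subset for b in subset):
                found.append(sorted(subset))
    return found


# Additive group Z_12.
subgroups_z12 = closed_subsets(range(12), lambda a, b: (a + b) % 12, 0)

# Multiplicative group of units of Z_13.
units13 = [a for a in range(1, 13) if gcd(a, 13) == 1]
subgroups_units13 = closed_subsets(units13, lambda a, b: (a * b) % 13, 1)

for s in subgroups_z12:
    print("Z_12  :", s)
for s in subgroups_units13:
    print("Z*_13 :", s)
```

Each list comes out with six entries, the two trivial subgroups included. The search is exponential in the size of the group, so it is only sensible for small examples such as these.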

4-2-6. Example. Consider the symmetric group on three letters, $S_3$, which
may also be viewed as the group of rigid motions (or symmetries) of the
equilateral triangle (see 4-1-9 and 4-1-11). The six elements of $S_3$ may be
denoted as

$$e, \quad \sigma_1, \quad \sigma_2, \quad \tau_1, \quad \tau_2, \quad \tau_3.$$

Naturally, the multiplication table for this group is the one already given in
4-1-9.

Obviously, the following are subgroups of $G = S_3$:

$$(e), \quad H_1 = \{e, \tau_1\}, \quad H_2 = \{e, \tau_2\}, \quad H_3 = \{e, \tau_3\}, \quad A = \{e, \sigma_1, \sigma_2\}, \quad S_3.$$

Furthermore, there are no other subgroups; this may be seen in the most
elementary manner. In more detail, suppose H is a subgroup. If $H \ne (e)$, then
H contains one (or more) of the elements $\tau_1, \tau_2, \tau_3, \sigma_1,
\sigma_2$. Taking powers of this element, H surely contains one of the subgroups
$H_1, H_2, H_3, A$. If H is not one of these subgroups, choose an element of H
not in the subgroup; then by repeatedly taking powers of this element and
multiplying these with elements of the subgroup, it follows that $H = S_3$.
Thus, there are no additional subgroups.
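The argument just given (take an element, form its powers, then adjoin a further element and multiply) can be imitated mechanically. The sketch below is an illustration under stated assumptions, not the book's procedure: permutations of {0, 1, 2} are stored as tuples, `generated` is a hypothetical helper that closes a set of generators under composition, and we use the fact that every subgroup of $S_3$ is generated by at most two of its elements.

```python
from itertools import permutations


def compose(p, q):
    """Composite of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generated(gens, identity):
    """Smallest set containing identity and gens that is closed under compose."""
    group = {identity, *gens}
    frontier = list(group)
    while frontier:
        x = frontier.pop()
        for y in list(group):
            for z in (compose(x, y), compose(y, x)):
                if z not in group:
                    group.add(z)
                    frontier.append(z)
    return frozenset(group)


s3 = list(permutations(range(3)))
e = tuple(range(3))

# Collect the distinct subgroups generated by all pairs of elements.
subgroups = {generated([a, b], e) for a in s3 for b in s3}
print(sorted(len(h) for h in subgroups))  # [1, 2, 2, 2, 3, 6]
```

The six sizes correspond to (e), the three two-element subgroups, A, and $S_3$ itself.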

Incidentally, each of the nontrivial subgroups has a geometric interpretation
in terms of rigid motions of the equilateral triangle. In particular, the
subgroup A consists precisely of the three rotations about the axis
perpendicular to the plane of the equilateral triangle and passing through its
“center”; while, for each i = 1, 2, 3, the subgroup $H_i$ consists of all rigid
motions which leave the vertex i fixed.

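Let us consider next the octic group (otherwise known as the group of
the square), which was described in 4-1-10. It consists of the eight rigid
motions of the square (shown in the accompanying figure),

[Figure: a square with its vertices labeled 1, 2, 3, 4.]

and its elements are denoted by

$$e = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}, \quad
\sigma_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 2 & 3 \end{pmatrix},$$

$$\tau_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}, \quad
\tau_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix},$$

together with $\rho_1$ and $\rho_2$, the two remaining motions, each of which
interchanges one pair of opposite vertices and leaves the other pair fixed (see
4-1-10 for their labeling).

Obviously, the octic group is a subgroup of $S_4$.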

From the multiplication table for the octic group (see 4-1-10) it is
straightforward to check that each of the following finite subsets of the octic
group is closed under multiplication:

$$\{e, \sigma_2\}, \quad \{e, \tau_2\}, \quad \{e, \rho_1\}, \quad \{e, \rho_2\},$$

$$\{e, \sigma_1, \sigma_2, \sigma_3\}, \quad \{e, \rho_1, \rho_2, \sigma_2\}, \quad \{e, \tau_1, \tau_2, \sigma_2\},$$

so each of them is a subgroup of the octic group. We shall leave it for the
reader to investigate the geometric interpretation of these subgroups and to
decide if there are any other nontrivial subgroups of the octic group.

4-2-7. Example. Let G be an arbitrary multiplicative group and consider
the subset

$$\mathfrak{Z} = \{a \in G \mid ax = xa \text{ for all } x \in G\}.$$

In words, $\mathfrak{Z}$ consists of those elements of G which “commute” with
every element of G. We show that $\mathfrak{Z}$, which is known as the center of
G, is a subgroup.

First of all, $\mathfrak{Z} \ne \emptyset$; in fact, $e \in \mathfrak{Z}$ since
ex = x = xe for all $x \in G$. It remains to verify that $\mathfrak{Z}$ is
closed under multiplication and taking of inverses, but these are easy. If
$a, b \in \mathfrak{Z}$, then ax = xa and bx = xb for all $x \in G$ and,
therefore,

$$(ab)x = a(bx) = a(xb) = (ax)b = (xa)b = x(ab)$$

for all $x \in G$; so $ab \in \mathfrak{Z}$. Furthermore, if
$a \in \mathfrak{Z}$, then for every $x \in G$ we have

$$\begin{aligned}
ax = xa &\Rightarrow a^{-1}(ax) = a^{-1}(xa) \\
&\Rightarrow x = a^{-1}xa \\
&\Rightarrow xa^{-1} = (a^{-1}xa)a^{-1} \\
&\Rightarrow xa^{-1} = a^{-1}x,
\end{aligned}$$

which says that $a^{-1} \in \mathfrak{Z}$. Thus, $\mathfrak{Z}$ is indeed a
subgroup.

Obviously, if the group G is commutative, then its center $\mathfrak{Z}$ equals
G, and conversely; that is,

$$G \text{ is abelian} \iff \mathfrak{Z} = G.$$
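For a small concrete group the center can be computed directly from its definition. The following sketch is only an illustration: it assumes permutations of {0, 1, 2} stored as tuples, and the helper names `compose` and `center` are not from the text.

```python
from itertools import permutations


def compose(p, q):
    """Composite of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def center(group):
    """Elements of `group` that commute with every element of `group`."""
    return [a for a in group
            if all(compose(a, x) == compose(x, a) for x in group)]


s3 = list(permutations(range(3)))
rotations = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]  # an abelian subgroup of S_3

print(center(s3))         # [(0, 1, 2)], the identity alone
print(center(rotations))  # all three rotations, since this group is abelian
```

The second call illustrates the closing remark: for an abelian group the center is the whole group.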

4-2-8. Example. As noted in 4-1-8,

$$W = \{z \in \mathbf{C} \mid |z| = 1\}$$

is an abelian group under multiplication. It is known as the “circle group”
because its elements are all complex numbers which lie on the “unit circle.”
Obviously, W is a subgroup of $\mathbf{C}^*$, the multiplicative group of units
(or nonzero elements) of the field $\mathbf{C}$.

Now, for each positive integer n, consider the set

$$W_n = \{z \in \mathbf{C} \mid z^n = 1\};$$

in words, $W_n$ consists of all complex numbers whose nth power is 1, so we
refer to $W_n$ as the set of “nth roots of unity.” The set $W_n$ is nonempty,
since the element 1 surely belongs to $W_n$. Even more, the set $W_n$ is finite;
in fact, for $z \in \mathbf{C}$,

$$\begin{aligned}
z \in W_n &\iff z^n - 1 = 0 \\
&\iff z \text{ is a root of the polynomial } f(x) = x^n - 1 \in \mathbf{C}[x];
\end{aligned}$$

but, according to 3-5-5, a polynomial of degree n over a field can have at
most n distinct roots in the field, so $\#(W_n) \le n$.

Of still greater interest is the fact that $W_n$ is a subgroup of W. In the
first place, $W_n \subset W$; in fact,

$$z \in W_n \Rightarrow z^n = 1 \Rightarrow |z^n| = |1| \Rightarrow |z|^n = 1$$

and, since |z| is a nonnegative real number, this implies $z \in W$. Then,
because $W_n$ is a nonempty finite subset of W, to prove it is a subgroup of W
it suffices to verify that $W_n$ is closed under multiplication. But this is
trivial:

$$z_1, z_2 \in W_n \Rightarrow z_1^n = 1,\ z_2^n = 1 \Rightarrow (z_1 z_2)^n = 1 \Rightarrow z_1 z_2 \in W_n.$$

In particular, we have shown that $W_n$ is a group with at most n elements.

The precise number of elements in $W_n$ (that is, the number of nth roots of
unity) is not of critical importance for us, but we digress to show the reader
with some trigonometric experience why this number is n.

Consider an arbitrary point (that is, complex number) on the unit circle
W. Drawing the radius (whose length is 1) from the origin to this point, and
with the angle $\theta$ as indicated in the accompanying diagram, the
coordinates of the point (in the X-Y plane) are clearly
$(\cos\theta, \sin\theta)$. (This is, essentially, the definition of
$\sin\theta$ and $\cos\theta$ for an arbitrary angle $\theta$.)
In other words, we are dealing with the complex number

$$\cos\theta + i\sin\theta.$$

Conversely, any complex number of form $\cos\theta + i\sin\theta$ is on the unit
circle (because its distance from the origin is
$\sqrt{\cos^2\theta + \sin^2\theta} = 1$). If we have two elements
$z_1, z_2 \in W$ and write them in the form

$$z_1 = \cos\theta_1 + i\sin\theta_1, \qquad z_2 = \cos\theta_2 + i\sin\theta_2,$$

then, in virtue of standard trigonometric identities, the computation of the
product of these complex numbers yields

$$z_1 z_2 = \cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2). \tag{$*$}$$

In other words, multiplication of two elements of the group W amounts to
adding their angles, as illustrated in the accompanying picture. Of course,
the identity is $1 = 1 + 0i = \cos 0 + i\sin 0$; that is, the identity is the
complex number associated with the angle $\theta = 0$. Also, for an arbitrary
element $z = \cos\theta + i\sin\theta$ of W, its inverse $z^{-1}$ is clearly

$$\cos(-\theta) + i\sin(-\theta)$$

since

$$(\cos\theta + i\sin\theta)(\cos(-\theta) + i\sin(-\theta)) = \cos(\theta - \theta) + i\sin(\theta - \theta) = 1.$$

Now, let us fix an integer n > 1 and analyze the group $W_n$ of nth roots of
unity. We consider the angle $\alpha = 360^\circ/n$, or in radian measure
$\alpha = 2\pi/n$, and put

$$\zeta = \cos\alpha + i\sin\alpha.$$

Of course, $\zeta \in W$. According to (*), we have

$$\zeta^2 = \cos(2\alpha) + i\sin(2\alpha)$$

and inductively, for any $r \ge 1$,

$$\zeta^r = \cos(r\alpha) + i\sin(r\alpha).$$

In addition, $\zeta^0 = 1 = \cos(0\alpha) + i\sin(0\alpha)$. Obviously, each of
the complex numbers

$$1 = \zeta^0,\ \zeta,\ \zeta^2,\ \ldots,\ \zeta^{n-1}$$

lies on the unit circle. Since

$$\zeta^n = \cos(n\alpha) + i\sin(n\alpha) = \cos(2\pi) + i\sin(2\pi) = 1,$$

$\zeta$ is an nth root of unity; hence, any power of $\zeta$ is also an nth root
of unity and, in particular, $1, \zeta, \ldots, \zeta^{n-1}$ all belong to
$W_n$. Moreover, these n complex numbers are distinct, for they obviously
represent the n equally spaced points on the unit circle, going counterclockwise
and starting from (1, 0).
Thus, the group $W_n$ has n elements and all of them are powers of the element
$\zeta = \cos\alpha + i\sin\alpha$, where $\alpha = 2\pi/n$; namely,

$$W_n = \{1 = \zeta^0,\ \zeta,\ \zeta^2,\ \ldots,\ \zeta^{n-1}\} = \{\zeta,\ \zeta^2,\ \ldots,\ \zeta^{n-1},\ \zeta^n = 1\}.$$
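The description of $W_n$ as the powers of $\zeta$ can be tried out numerically. The sketch below is an illustration only: it uses floating-point arithmetic from Python's `cmath`, so equalities are tested up to a small tolerance, and the names `roots_of_unity` and `close` are assumptions made for the example.

```python
import cmath


def roots_of_unity(n):
    """Return 1, zeta, ..., zeta^(n-1) for zeta = cos(2*pi/n) + i*sin(2*pi/n)."""
    zeta = cmath.exp(2j * cmath.pi / n)
    return [zeta ** r for r in range(n)]


def close(z, w, tol=1e-9):
    """Approximate equality of complex numbers (rounding errors are expected)."""
    return abs(z - w) < tol


n = 6
w_n = roots_of_unity(n)

# Every listed number is an nth root of unity lying on the unit circle.
assert all(close(z ** n, 1) for z in w_n)
assert all(close(abs(z), 1) for z in w_n)

# The n numbers are distinct.
assert all(not close(w_n[r], w_n[s]) for r in range(n) for s in range(r))

# Closure: multiplying adds the angles, so zeta^r * zeta^s = zeta^((r+s) mod n).
assert all(close(w_n[r] * w_n[s], w_n[(r + s) % n])
           for r in range(n) for s in range(n))
```

Changing `n` repeats the check for any other positive integer, within the limits of floating-point accuracy.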


4-2-9. Proposition. The intersection of two subgroups of a given group is
a subgroup. More generally, if $\{H_\alpha\}$ is any collection of subgroups of
the group G, then their intersection, $\bigcap_\alpha H_\alpha$, is a subgroup
of G.


Proof: For convenience, let us write $H = \bigcap_\alpha H_\alpha$. Because the
identity e of G belongs to every subgroup of G and, in particular, to every
$H_\alpha$, it follows that $e \in H$; so H is nonempty. To show H is a
subgroup, we verify the condition given in 4-2-2, part (iii); namely,

$$\begin{aligned}
a, b \in H &\Rightarrow a, b \in H_\alpha \text{ for every } \alpha \\
&\Rightarrow ab^{-1} \in H_\alpha \text{ for every } \alpha \quad (\text{because } H_\alpha \text{ is a subgroup}) \\
&\Rightarrow ab^{-1} \in H. \qquad \blacksquare
\end{aligned}$$

4-2-10. Example. Consider the group $S_X$, introduced in 4-1-11, of all
permutations of a given set X. Fix an element y of X, and let

$$H_y = \{\sigma \in S_X \mid \sigma y = y\}.$$

In words, $H_y$ is the set of all permutations of X (that is, mappings of
$X \to X$ which are 1-1 and onto) which map the element y to itself (that is,
which keep y fixed). Is $H_y$ a subgroup of $S_X$?

In the first place, $H_y$ is nonempty: the identity element 1 of $S_X$ (which is
the identity map $1: X \to X$) obviously belongs to $H_y$. In addition, $H_y$ is
closed under the operation; for if $\sigma, \tau \in H_y$, then

$$(\sigma \circ \tau)y = \sigma(\tau(y)) = \sigma y = y,$$

so $\sigma \circ \tau \in H_y$. Finally, $H_y$ is closed under taking of
inverses; for if $\sigma \in H_y$, then $\sigma y = y$, and applying
$\sigma^{-1}$ gives $\sigma^{-1}(\sigma y) = \sigma^{-1}(y)$, so
$\sigma^{-1} y = y$ and $\sigma^{-1} \in H_y$. Thus, $H_y$ is a subgroup of
$S_X$.

More generally, suppose $Y = \{y_1, y_2, \ldots, y_n\}$ is a finite subset of X,
and let $H_Y$ denote the set of all permutations $\sigma \in S_X$ which leave Y
pointwise fixed (that is, $\sigma y_i = y_i$ for i = 1, 2, ..., n). Then $H_Y$
is a subgroup of $S_X$, as the reader can easily verify in a straightforward
manner. Another, more interesting, way to see this is to consider the
intersection of subgroups

$$H_{y_1} \cap H_{y_2} \cap \cdots \cap H_{y_n}.$$

According to 4-2-9, this intersection is a subgroup of $S_X$. Moreover, it is
immediate from the definitions (as the reader may convince himself) that

$$H_{y_1} \cap \cdots \cap H_{y_n} = H_Y.$$

Obviously, the same situation applies when Y is an arbitrary subset of X.
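The identity between the intersection of the subgroups $H_y$ and the pointwise stabilizer $H_Y$ can be illustrated on a small set. In this sketch, which rests on illustrative assumptions only, X = {0, 1, 2, 3}, a permutation is a tuple `s` with `s[i]` the image of `i`, and the function names are hypothetical.

```python
from itertools import permutations

X = range(4)
s_x = list(permutations(X))  # all 24 permutations of X


def stabilizer(y):
    """H_y: the permutations that keep the single element y fixed."""
    return {s for s in s_x if s[y] == y}


def pointwise_stabilizer(Y):
    """H_Y: the permutations that keep every element of Y fixed."""
    return {s for s in s_x if all(s[y] == y for y in Y)}


Y = [0, 1]
intersection = set.intersection(*(stabilizer(y) for y in Y))

assert intersection == pointwise_stabilizer(Y)
print(len(stabilizer(0)), len(intersection))  # 6 2
```

Here $H_0$ has 3! = 6 elements, and fixing both 0 and 1 leaves only the identity and the swap of 2 and 3.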

Having studied several examples of subgroups, our next objective is to 
see how a group can be decomposed with respect to a subgroup. 

4-2-11. Definition. Suppose H is a subgroup of the group G. For any
$a \in G$, let us write

$$aH = \{ah \mid h \in H\}$$

(in words, aH is the subset of G consisting of all elements of form ah, as h
runs over H) and refer to aH as the left coset of H in G determined by a.

Similarly, we may write

$$Ha = \{ha \mid h \in H\}$$

and refer to it as the right coset of H in G determined by a.

No one will be surprised at the assertion that there is a complete parallelism
between results about left cosets and results about right cosets. Because of
this, we shall deal almost exclusively with left cosets and refer to them
(somewhat carelessly) simply as cosets. It will usually be left to the reader to
convince himself that each statement made about cosets (meaning: left cosets)
has an analog for right cosets.

The definition of a coset was stated for the situation where the group
operation is multiplication. When the group operation is addition, a coset
should obviously be written in the form

$$a + H = \{a + h \mid h \in H\}.$$

4-2-12. Examples. (1) Consider the group
$G = S_3 = \{e, \sigma_1, \sigma_2, \tau_1, \tau_2, \tau_3\}$ (with notation as
in 4-1-9, 4-1-11, 4-2-6) and the subgroup $H_1 = \{e, \tau_1\}$. Each element of
G determines a (left) coset of $H_1$ in G. Making use of the multiplication
table for $S_3$, and dropping the $\circ$ symbol for multiplication, the cosets
are seen to be

$$\begin{aligned}
eH_1 &= \{ee,\ e\tau_1\} = \{e,\ \tau_1\}, \\
\sigma_1 H_1 &= \{\sigma_1 e,\ \sigma_1\tau_1\} = \{\sigma_1,\ \tau_3\}, \\
\sigma_2 H_1 &= \{\sigma_2 e,\ \sigma_2\tau_1\} = \{\sigma_2,\ \tau_2\}, \\
\tau_1 H_1 &= \{\tau_1 e,\ \tau_1\tau_1\} = \{\tau_1,\ e\}, \\
\tau_2 H_1 &= \{\tau_2 e,\ \tau_2\tau_1\} = \{\tau_2,\ \sigma_2\}, \\
\tau_3 H_1 &= \{\tau_3 e,\ \tau_3\tau_1\} = \{\tau_3,\ \sigma_1\}.
\end{aligned}$$

Note that each coset contains two elements (because the subgroup $H_1$ has two
elements) and that some of the cosets are equal (as sets); namely,

$$eH_1 = \tau_1 H_1 = H_1, \qquad \sigma_1 H_1 = \tau_3 H_1, \qquad \sigma_2 H_1 = \tau_2 H_1.$$

(The order in which the elements of a set are listed does not matter.)

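The right cosets of $H_1$ in G are as follows:

$$\begin{aligned}
H_1 e &= \{ee,\ \tau_1 e\} = \{e,\ \tau_1\}, \\
H_1 \sigma_1 &= \{e\sigma_1,\ \tau_1\sigma_1\} = \{\sigma_1,\ \tau_2\}, \\
H_1 \sigma_2 &= \{e\sigma_2,\ \tau_1\sigma_2\} = \{\sigma_2,\ \tau_3\}, \\
H_1 \tau_1 &= \{e\tau_1,\ \tau_1\tau_1\} = \{\tau_1,\ e\}, \\
H_1 \tau_2 &= \{e\tau_2,\ \tau_1\tau_2\} = \{\tau_2,\ \sigma_1\}, \\
H_1 \tau_3 &= \{e\tau_3,\ \tau_1\tau_3\} = \{\tau_3,\ \sigma_2\},
\end{aligned}$$

and we have

$$H_1 e = H_1 \tau_1 = H_1, \qquad H_1 \sigma_1 = H_1 \tau_2, \qquad H_1 \sigma_2 = H_1 \tau_3.$$

Note that the right cosets are not the same as the left cosets.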

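If we take the subgroup $H_2 = \{e, \tau_2\}$, then its left cosets in
$G = S_3$ are

$$\begin{aligned}
eH_2 &= \{e, \tau_2\}, & \sigma_1 H_2 &= \{\sigma_1, \tau_1\}, & \sigma_2 H_2 &= \{\sigma_2, \tau_3\}, \\
\tau_1 H_2 &= \{\tau_1, \sigma_1\}, & \tau_2 H_2 &= \{\tau_2, e\}, & \tau_3 H_2 &= \{\tau_3, \sigma_2\},
\end{aligned}$$

and its right cosets in $G = S_3$ are

$$\begin{aligned}
H_2 e &= \{e, \tau_2\}, & H_2 \sigma_1 &= \{\sigma_1, \tau_3\}, & H_2 \sigma_2 &= \{\sigma_2, \tau_1\}, \\
H_2 \tau_1 &= \{\tau_1, \sigma_2\}, & H_2 \tau_2 &= \{\tau_2, e\}, & H_2 \tau_3 &= \{\tau_3, \sigma_1\}.
\end{aligned}$$

Here too the left and right cosets are not the same.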
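The difference between left and right cosets can be confirmed without consulting the multiplication table. The sketch below is illustrative and makes its own assumptions: permutations of {0, 1, 2} are tuples, `H` is one of the two-element subgroups (identity plus a single swap), and `A` is the rotation subgroup of 4-2-6; which swap corresponds to which $\tau_i$ in the book's labeling is not assumed.

```python
from itertools import permutations


def compose(p, q):
    """Composite of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def left_cosets(G, H):
    """The collection of distinct left cosets aH, each as a frozenset."""
    return {frozenset(compose(a, h) for h in H) for a in G}


def right_cosets(G, H):
    """The collection of distinct right cosets Ha, each as a frozenset."""
    return {frozenset(compose(h, a) for h in H) for a in G}


s3 = list(permutations(range(3)))
e = (0, 1, 2)
H = [e, (0, 2, 1)]             # identity and one swap
A = [e, (1, 2, 0), (2, 0, 1)]  # identity and the two 3-cycles

print(left_cosets(s3, H) == right_cosets(s3, H))  # False
print(left_cosets(s3, A) == right_cosets(s3, A))  # True
```

For the two-element subgroup the two decompositions of $S_3$ differ, while for A every left coset is also a right coset.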

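Finally, let us consider the subgroup $A = \{e, \sigma_1, \sigma_2\}$ of
$G = S_3$. The left cosets of A are

$$\begin{aligned}
eA &= \{e, \sigma_1, \sigma_2\}, & \sigma_1 A &= \{\sigma_1, \sigma_2, e\}, & \sigma_2 A &= \{\sigma_2, e, \sigma_1\}, \\
\tau_1 A &= \{\tau_1, \tau_2, \tau_3\}, & \tau_2 A &= \{\tau_2, \tau_3, \tau_1\}, & \tau_3 A &= \{\tau_3, \tau_1, \tau_2\},
\end{aligned}$$

and the right cosets of A are

$$\begin{aligned}
Ae &= \{e, \sigma_1, \sigma_2\}, & A\sigma_1 &= \{\sigma_1, \sigma_2, e\}, & A\sigma_2 &= \{\sigma_2, e, \sigma_1\}, \\
A\tau_1 &= \{\tau_1, \tau_3, \tau_2\}, & A\tau_2 &= \{\tau_2, \tau_1, \tau_3\}, & A\tau_3 &= \{\tau_3, \tau_2, \tau_1\}.
\end{aligned}$$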

In this situation, the left and right cosets are the same, in the sense that
every left coset is a right coset and conversely; in fact,

$$\begin{aligned}
eA &= \sigma_1 A = \sigma_2 A = Ae = A\sigma_1 = A\sigma_2 = A, \\
\tau_1 A &= \tau_2 A = \tau_3 A = A\tau_1 = A\tau_2 = A\tau_3.
\end{aligned}$$

(2) Consider the multiplicative group
$G = \mathbf{Z}^*_{13} = \{1, 2, \ldots, 11, 12\}$ and the subgroup
H = {1, 5, 8, 12} which was mentioned in 4-2-5. Because G is commutative, there
is no distinction between left and right cosets. Some typical cosets of H in G
are

$$\begin{aligned}
8H &= \{8 \cdot 1,\ 8 \cdot 5,\ 8 \cdot 8,\ 8 \cdot 12\} = \{8, 1, 12, 5\}, \\
7H &= \{7 \cdot 1,\ 7 \cdot 5,\ 7 \cdot 8,\ 7 \cdot 12\} = \{7, 9, 4, 6\}, \\
3H &= \{3 \cdot 1,\ 3 \cdot 5,\ 3 \cdot 8,\ 3 \cdot 12\} = \{3, 2, 11, 10\}.
\end{aligned}$$
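The three cosets displayed in part (2) can be reproduced, and the remaining ones found, with a few lines of arithmetic modulo 13. This is only a sketch; the function name `coset` and the default modulus argument are assumptions made for the illustration.

```python
G = list(range(1, 13))  # the units of Z_13
H = [1, 5, 8, 12]


def coset(a, H, m=13):
    """The coset aH in the multiplicative group of units modulo m, sorted."""
    return sorted((a * h) % m for h in H)


print(coset(8, H))  # [1, 5, 8, 12]
print(coset(7, H))  # [4, 6, 7, 9]
print(coset(3, H))  # [2, 3, 10, 11]

# Collect the distinct cosets determined by all twelve elements.
distinct = {tuple(coset(a, H)) for a in G}
print(sorted(distinct))
```

Only three distinct cosets occur, and between them they contain each of the twelve elements of G exactly once.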

(3) Consider $G = \mathbf{Z}$, the group of integers under addition, and the
subgroup $H = 7\mathbf{Z}$. Because the operation is addition, the coset of H in
G determined by the element 5, say, takes the form

$$5 + 7\mathbf{Z} = \{5 + 7t \mid t \in \mathbf{Z}\}.$$

But, lo and behold, this coset is precisely what we have heretofore referred to
as the residue (or congruence) class $\lfloor 5 \rfloor_7$. Obviously, then, for
any $a \in \mathbf{Z}$, the coset it determines is

$$a + 7\mathbf{Z} = \{a + 7t \mid t \in \mathbf{Z}\} = \lfloor a \rfloor_7.$$

More generally, as seen in 4-2-4, an arbitrary nontrivial subgroup of
$\mathbf{Z}$ is of form $m\mathbf{Z}$ for some integer m > 1. Then, for any
$a \in \mathbf{Z}$, the coset of $m\mathbf{Z}$ in $\mathbf{Z}$ which it
determines is

$$a + m\mathbf{Z} = \{a + mt \mid t \in \mathbf{Z}\} = \lfloor a \rfloor_m.$$

This shows, in particular, that residue classes (mod m) are really very special
cases of cosets; namely, a residue class (mod m) is a coset when we are dealing
with the additive group $\mathbf{Z}$ and the subgroup $m\mathbf{Z}$.

In virtue of this, it is natural to inquire if the properties of residue classes 
(mod m) (see 1-7-10 and 1-7-12) have analogs for cosets in general. They 
do—in fact, we have 


4-2-13. Properties. Suppose H is a subgroup of the group G; then for
any $a, b \in G$ we have:

(1) H itself is a coset (of H in G).

Proof: Since $eH = \{eh \mid h \in H\} = \{h \mid h \in H\} = H$, the coset
determined by e is H itself. ∎

(2) $a \in aH$. In words, every element of G belongs to the coset which it
determines; in particular, every element of G belongs to some coset,
and the cosets cover G.

Proof: Since $e \in H$, we have $a = ae \in aH$. ∎

(3) $b \in aH \iff a \in bH$. In words, b belongs to the coset determined by a
if and only if a belongs to the coset determined by b.

Proof: Because the roles of a and b are interchangeable, it suffices to prove
the implication $\Rightarrow$. If $b \in aH$, then b = ah for some $h \in H$.
Since H is a subgroup, we have $h^{-1} \in H$; so $a = bh^{-1} \in bH$. ∎

(4) $b \in aH \Rightarrow bH = aH$. In other words, if an element belongs to a
coset, then it determines that coset.

Proof: First, let us show: $b \in aH \Rightarrow bH \subset aH$. In fact, given
$b \in aH$, it is of form $b = ah'$ for some $h' \in H$. Then any element of bH
is of form $bh = ah'h$ with $h \in H$. But $h'h \in H$ because H is a group; so
$bh = ah'h \in aH$, and indeed $bH \subset aH$.

In virtue of (3) above, and the foregoing (with the roles of a and b
interchanged), we have

$$b \in aH \Rightarrow a \in bH \Rightarrow aH \subset bH.$$

Combining this with the foregoing gives $b \in aH \Rightarrow bH = aH$. ∎

(5) $aH \cap bH \ne \emptyset \Rightarrow aH = bH$. In other words, if two
cosets meet they are the same coset; so two cosets are disjoint sets or they are
identical.

Proof: If $c \in aH \cap bH$, then, by (4), cH = aH and cH = bH. ∎


An immediate consequence of these properties of cosets is the following: 

4-2-14. Proposition. If H is any subgroup of the group G, then G decomposes
as a union of disjoint cosets of H in G. In particular, an element of
G belongs to exactly one coset.

Proof: There is nothing to prove; we merely point out that if the cosets
aH and bH are equal, aH = bH, then they are counted as a single coset. This
permits us to talk about the union of “disjoint cosets.” ∎

4-2-15. Remark. The preceding discussion of cosets can be understood
better in terms of the general framework of an equivalence relation (see
3-4-3). As a point of departure, we observe a simple criterion for the elements
$a, b \in G$ to belong to the same coset (that is, for aH to equal bH); namely,
by making use of 4-2-13, part (3),

$$aH = bH \iff a \in bH \iff b^{-1}a \in H.$$

Now, let us define the relation $a \equiv b \pmod{H}$ (which one reads in the
obvious way) on G by

$$a \equiv b \pmod{H} \iff b^{-1}a \in H.$$

This is an equivalence relation. In fact: $a \equiv a \pmod{H}$, since
$a^{-1}a = e \in H$; $a \equiv b \pmod{H} \Rightarrow b \equiv a \pmod{H}$,
since $b^{-1}a \in H \Rightarrow a^{-1}b = (b^{-1}a)^{-1} \in H$;
$a \equiv b \pmod{H}$ and $b \equiv c \pmod{H}$ imply $a \equiv c \pmod{H}$,
since $b^{-1}a \in H$ and $c^{-1}b \in H$ imply
$c^{-1}a = (c^{-1}b)(b^{-1}a) \in H$.

According to 3-4-3, the equivalence classes with respect to this equivalence
relation provide a disjoint decomposition of G. Denoting the equivalence
class to which a belongs by $\lfloor a \rfloor_H$, we recall that it is defined
by

$$\lfloor a \rfloor_H = \{b \in G \mid b \equiv a \pmod{H}\}.$$

What is this subset $\lfloor a \rfloor_H$ of G? Since

$$\begin{aligned}
b \in \lfloor a \rfloor_H &\iff b \equiv a \pmod{H} \\
&\iff a^{-1}b \in H \\
&\iff b \in aH,
\end{aligned}$$

we see that $\lfloor a \rfloor_H = aH$. Thus, cosets are equivalence classes, and
the coset decomposition of G with respect to the subgroup H is really the
decomposition of G into equivalence classes under the equivalence relation
$a \equiv b \pmod{H}$.
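The identification of equivalence classes with cosets can be tested on $S_3$. The sketch below is an illustration under its own assumptions: permutations of {0, 1, 2} are tuples, H is an arbitrarily chosen two-element subgroup, and `congruent` is a hypothetical name for the relation $a \equiv b \pmod{H}$.

```python
from itertools import permutations


def compose(p, q):
    """Composite of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse permutation: if p sends i to p[i], the inverse sends p[i] to i."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


G = list(permutations(range(3)))
H = {(0, 1, 2), (1, 0, 2)}  # identity and one swap


def congruent(a, b):
    """a is congruent to b (mod H) exactly when b^(-1) a lies in H."""
    return compose(inverse(b), a) in H


# Equivalence classes of the relation, and left cosets aH, as sets of sets.
classes = {frozenset(b for b in G if congruent(b, a)) for a in G}
cosets = {frozenset(compose(a, h) for h in H) for a in G}

assert classes == cosets
print(len(classes))  # 3
```

The two collections coincide, as the computation of $\lfloor a \rfloor_H$ above predicts.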


4-2-16. Theorem (Lagrange). Suppose H is a subgroup of the group G.
Denote the number of elements (which may be $\infty$) of G by #(G), and call
it the order of G. Denote the number of left cosets (which may be $\infty$) of
H in G by (G : H), and call it the index of H in G.

If the group G is finite, then

$$\#(G) = (G : H)\,\#(H).$$

In particular, if G is finite the order of any subgroup divides the order of G.

Proof: We observe that any left coset, aH, of H in G has the same number
of elements as H. In fact, the mapping $h \to ah$ of

$$H \to aH$$

is onto (by definition of aH) and 1-1 (since, by cancellation, $ah_1 = ah_2$
implies $h_1 = h_2$); so #(aH) = #(H).

Therefore, G is a union of disjoint cosets (there are (G : H) of them), each
of which consists of #(H) elements; so $\#(G) = (G : H)\,\#(H)$.

We have proved: For any subgroup H of the finite group G, the order of
G equals the order of H times the index of H in G. ∎
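As a numerical illustration of Lagrange's theorem, one may take $G = S_4$ and H the octic group of 4-2-6. The sketch below is not from the text: permutations act on {0, 1, 2, 3} instead of {1, 2, 3, 4}, the octic group is obtained by closing two of its elements (a quarter turn and a reflection) under composition, and the helper names are assumptions.

```python
from itertools import permutations


def compose(p, q):
    """Composite of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generated(gens):
    """Smallest set containing the generators that is closed under compose."""
    identity = tuple(range(len(gens[0])))
    group = {identity, *gens}
    frontier = list(group)
    while frontier:
        x = frontier.pop()
        for y in list(group):
            for z in (compose(x, y), compose(y, x)):
                if z not in group:
                    group.add(z)
                    frontier.append(z)
    return group


G = set(permutations(range(4)))  # S_4, with 24 elements
quarter_turn = (1, 2, 3, 0)      # sends i to i + 1 (mod 4)
reflection = (3, 2, 1, 0)        # sends i to 3 - i
H = generated([quarter_turn, reflection])

cosets = {frozenset(compose(a, h) for h in H) for a in G}

print(len(G), len(cosets), len(H))  # 24 3 8
assert len(G) == len(cosets) * len(H)
```

The index (G : H) comes out as 3, in agreement with 24 = 3 · 8.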

4-2-17. Corollary. If the group G has a prime number of elements, then
it has no subgroups except the trivial ones.

Proof: Suppose #(G) = p. If H is a subgroup of G, then #(H) divides
#(G), so #(H) is 1 or p. Consequently, the only possibilities are H = (e) or
H = G. ∎

This tells us, for example, that the group $\{\mathbf{Z}_{13}, +\}$ (which has
13 elements) has no nontrivial subgroups, an assertion which was made in 4-2-5.


4-2-18 / PROBLEMS 

1. Suppose H is a subgroup of G and F is a nonempty subset of H; show
that F is a subgroup of H $\iff$ F is a subgroup of G.

2. In the additive group of integers, $\{\mathbf{Z}, +\}$, which of the
following subsets are subgroups:

(i) $6\mathbf{Z} \cap 12\mathbf{Z}$, (ii) $6\mathbf{Z} \cup 12\mathbf{Z}$,

(iii) $6\mathbf{Z} \cap 10\mathbf{Z}$, (iv) $6\mathbf{Z} \cup 10\mathbf{Z}$,

(v) all integers divisible by 3 or 5,

(vi) all integers divisible by 3 and 5.

For those that are subgroups, express them in the form $m\mathbf{Z}$.
(According to 4-2-4, we know that every subgroup of $\{\mathbf{Z}, +\}$ is of
form $m\mathbf{Z}$.)

3. Verify that the following are subgroups of $\{\mathbf{Z}_{15}, +\}$:

(i) {0, 5, 10}, (ii) {0, 3, 6, 9, 12}.

Find other subgroups, if you can.

4. The group $\{\mathbf{Z}^*_{15}, \cdot\}$ consists of the eight elements
1, 2, 4, 7, 8, 11, 13, 14. Which of the following subsets are subgroups?

(i) {1, 2, 4, 8}, (ii) {1, 2, 4, 8, 11, 13},

(iii) {1, 4, 7, 13}, (iv) {1, 7, 14}.

Find additional subgroups, if you can.

5. (i) Show that if $\{R, +, \cdot\}$ is a ring and S is a subring, then S is a
subgroup of $\{R, +\}$, the additive group of R.

(ii) On the other hand, a subgroup S of $\{R, +\}$ need not be a subring.
Prove this by exhibiting an example.

(iii) Give an example of a ring $\{R, +, \cdot\}$ for which every subgroup of
$\{R, +\}$ is also a subring.

6. Show that in 4-2-3 the hypothesis of finiteness for the subset H cannot be 
dropped. More precisely, give an example of an infinite subset H of a 
group G such that H is closed under the operation but is not a subgroup.

7. Find as many subgroups of $\{\mathbf{Z}_m, +\}$ as you can for each of the
following values of m:

(i) 5, (ii) 6, (iii) 7,

(iv) 10, (v) 14, (vi) 18.

Have you found them all?

8. Do Problem 7 for the groups $\{\mathbf{Z}^*_m, \cdot\}$.

9. If m is a nonnegative integer, then

$$m\mathbf{Z} = (0) \iff m = 0 \qquad \text{and} \qquad m\mathbf{Z} = \mathbf{Z} \iff m = 1.$$

More generally, if m and n are nonnegative integers, then

$$m\mathbf{Z} \subset n\mathbf{Z} \iff n \mid m$$

(we permit $0 \mid 0$) and consequently

$$m\mathbf{Z} = n\mathbf{Z} \iff m = n.$$

Thus, the subgroups $m\mathbf{Z}$ ($m \ge 0$) are all distinct.

10. The group $S_4$ has 24 elements; find a subgroup with the following number
of elements:

(i) 2, (ii) 3, (iii) 4, (iv) 6, (v) 8, (vi) 12.

11. Prove: If H is a subgroup of G, then

$$H = \{x \in G \mid xH \subset H\}.$$

12. Suppose $H_1 \subset H_2 \subset H_3 \subset \cdots$ is an ascending
sequence (which may be finite or infinite) of subgroups of a given group G. If
we put

$$H = \bigcup_i H_i,$$

then H is a subgroup of G.

13. In a group G, the union of two subgroups is a subgroup if and only if one 
of them is contained in the other. 


14. Suppose we are given an arbitrary set X and an abelian group {G, +}. 
Show how Map(X, G) can be made into an abelian group. What happens 
if the operation in G is multiplication ? and if G is not abelian ? 


15. For an arbitrary element a in the group G, define a mapping
$L_a : G \to G$ by putting

$$L_a(x) = ax, \quad \text{for all } x \in G.$$

Then the map $L_a$, which is known as “left multiplication by a,” is 1-1 and
onto; in other words, $L_a$ is a permutation of the set G, that is,
$L_a \in S_G$. In the same way, “right multiplication by a” is also an element
of $S_G$.





16. For each element a in the group { Z*, •}, find its inverse; express it as a 
power of a , if you can. Do the same thing for the multiplicative group: 

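(i) $\mathbf{Z}^*$, (ii) $\mathbf{Z}^*_{10}$, (iii) $\mathbf{Z}^*_{11}$, (iv) $\mathbf{Z}^*_{12}$,

(v) $\mathbf{Z}^*_{13}$, (vi) $\mathbf{Z}^*_{14}$, (vii) $\mathbf{Z}^*_{15}$, (viii) $\mathbf{Z}^*_{17}$.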

17. For which elements of $\{\mathbf{Z}^*_{13}, \cdot\}$ can you find a
nontrivial subgroup containing them?

18. (i) Show that in the group of the square (see 4-1-10) the element
$\sigma_2$ belongs to the center, and $\tau_2$ does not. Find the center.

(ii) The center of $S_3$ is (e).

(iii) Find the center of $S_4$. What about the center of $S_X$ for an arbitrary
set X?

19. Consider an arbitrary set X and a nonempty subset Y. Is
$\{\sigma \in S_X \mid \sigma Y = Y\}$ a subgroup of $S_X$? Is
$\{\sigma \in S_X \mid \sigma Y \subset Y\}$ a subgroup of $S_X$?

20. (i) Prove: For $W_n$, the group of complex nth roots of unity (see 4-2-8),
we have (with m > 1, n > 1)

$$W_m \subset W_n \iff m \mid n.$$

(ii) Show that the set

$$W_\infty = \bigcup_{n=1}^{\infty} W_n$$

is a subgroup of the unit circle W. It is known as the group of all roots of
unity.

21. List all left and right cosets for the subgroup $H_3 = \{e, \tau_3\}$ of
$S_3$. [Notation as in 4-2-12, part (1).]

22. List all cosets of H in $\{\mathbf{Z}_{12}, +\}$ when H is the subgroup:

(i) {0, 4, 8}, (ii) {0, 3, 6, 9}, (iii) {0, 2, 4, 6, 8, 10}.

23. List all cosets of H in $\{\mathbf{Z}^*_{13}, \cdot\}$ when H is the
subgroup:

(i) {1, 12}, (ii) {1, 3, 9}, (iii) {1, 3, 4, 9, 10, 12}.

24. Let G be the group of the square, as discussed in 4-2-6. Find all cosets
(left and right) of H in G when H is the subgroup

(i) $\{e, \sigma_2\}$, (ii) $\{e, \tau_2\}$, (iii) $\{e, \rho_1\}$,

(iv) $\{e, \rho_1, \rho_2, \sigma_2\}$, (v) $\{e, \tau_1, \tau_2, \sigma_2\}$.

25. Consider the subgroup $H_y$ of $S_X$ (notation as in 4-2-10). For
$\rho \in S_X$, characterize the left coset to which $\rho$ belongs; more
precisely, show that

$$\rho H_y = \{\tau \in S_X \mid \tau y = \rho y\}.$$

Can you say anything about the right coset $H_y \rho$?

26. Let Y be any subset of X and consider the subgroup

$$H_Y = \bigcap_{y \in Y} H_y$$

of $S_X$. Characterize the left coset to which $\rho \in S_X$ belongs.

27. Suppose H is a subgroup of G; show that

(i) the analogs of the properties of left cosets, which were listed in
4-2-13, hold for the right cosets of H in G,

(ii) G decomposes into a disjoint union of right cosets of H in G,

(iii) Let us define a relation on G by

$$a \approx b \pmod{H} \iff ba^{-1} \in H$$

(or equivalently $\iff ab^{-1} \in H$). Then this is an equivalence relation
on G and its equivalence classes are the right cosets of H in G.

28. Compare the meanings of $a \equiv b \pmod{H}$ and $a \approx b \pmod{H}$
when the group G is additive. Do so, in particular, when G is the group
$\{\mathbf{Z}, +\}$ and H is the subgroup $m\mathbf{Z}$.

29. Suppose H is a subgroup of G. If G is finite, then any right coset has the
same number of elements as any left coset, and the number of right cosets
is equal to the number of left cosets. More generally, for arbitrary G,
finite or infinite:

(i) there is a 1-1 correspondence between the elements of any two left
(or right) cosets,

(ii) there is a 1-1 correspondence between the elements of any left
coset and any right coset,

(iii) there is a 1-1 correspondence between the set of left cosets and the
set of right cosets. (Note: $aH \leftrightarrow Ha$ will not do.)

30. What happens to Lagrange’s theorem, 4-2-16, when G is infinite, that is,
when $\#(G) = \infty$?

31. Use Lagrange’s theorem to help verify that

(i) the subgroups of $\{\mathbf{Z}_{12}, +\}$ listed in 4-2-5 are indeed all
subgroups,

(ii) the subgroups of $\{\mathbf{Z}^*_{13}, \cdot\}$ listed in 4-2-5 are indeed
all subgroups,

(iii) the subgroups of the octic group listed in 4-2-6 are indeed all
subgroups. Is the four group (see 4-1-12, Problem 16) one of these
subgroups?

32. Compare the groups $\{\mathbf{Z}_{12}, +\}$ and
$\{\mathbf{Z}^*_{13}, \cdot\}$.

33. Use Lagrange’s theorem to help find all subgroups of the group of rigid 
motions of the 

(i) regular pentagon (see 4-1-12, Problem 17), 

(ii) regular hexagon (see 4-1-12, Problem 18). 

It is instructive to interpret the subgroups geometrically.

4-3. Cyclic Groups

Consider an element a of the multiplicative group G, and let H be any
subgroup of G which contains a. Since $a \in H$, we have $a^2 \in H$ and,
inductively, $a^n \in H$ for every $n \ge 1$. Furthermore, $a^0 = e \in H$, and
because a subgroup is closed under taking of inverses,
$a^{-n} = (a^n)^{-1} \in H$ for every $n \ge 1$. Thus, all powers of a
(positive, negative, or zero) belong to H; in symbols:
$\{a^n \mid n \in \mathbf{Z}\} \subset H$.

Let us introduce the notation

$$[a] = \{a^n \mid n \in \mathbf{Z}\}.$$

Then [a] (call it “bracket of a” momentarily) is a subgroup of G; in fact,
by the rules for exponents, given $a^i$ and $a^j$ in [a] (any two elements of
[a] are of this form), then $a^i \cdot a^j = a^{i+j} \in [a]$ and
$a^{-i} \in [a]$. We see immediately that:


4-3-1. Proposition. For any $a \in G$, $[a] = \{a^n \mid n \in \mathbf{Z}\}$ is
a subgroup of G. Moreover,

(i) [a] is contained in any subgroup of G that contains a,

(ii) [a] is the intersection of all subgroups of G that contain a.


Proof: (i) has already been done. In virtue of (i) and the fact that [a] is
a subgroup of G which contains a, it follows that (ii) holds. ∎

One sometimes summarizes this result by saying: [a] is the unique
smallest subgroup of G that contains a.

If the operation in the group G is addition, then instead of powers of a
we are concerned with multiples of a; namely,

$$a,\quad a + a = 2a,\quad a + 2a = 3a,\quad \ldots,\quad -a = (-1)a,\quad -(2a) = (-2)a,\quad \ldots$$

In other words, in this situation we have

$$[a] = \{na \mid n \in \mathbf{Z}\}.$$

Our discussion will deal almost exclusively with multiplicative groups. The
reader will have no difficulty translating the results to additive groups.

4-3-2. Definition. If there exists an element a in the group G for which
G = [a], then G is said to be a cyclic group and a is said to be a generator of
G. We also say that G is a cyclic group generated by a.

More generally, for any $a \in G$, [a] is known as the cyclic group (or
subgroup) generated by a (because, when we view the subgroup [a] of G as a
group in its own right, it is clearly cyclic and a is a generator).

4-3-3. Examples. (1) Consider the additive group of integers,
$\{\mathbf{Z}, +\}$. What is [1]? Because we are in an additive situation, we
know that

$$[1] = \{n \cdot 1 \mid n \in \mathbf{Z}\}.$$

In simple-minded fashion we obtain this by adding 1 to itself over and over
(including zero times) and taking inverses (that is, negatives). In any case,
we obviously have

$$[1] = \mathbf{Z},$$

so $\{\mathbf{Z}, +\}$ is a cyclic group, and 1 is a generator. If we compute
[−1] in the same way, we see that $[-1] = \mathbf{Z}$; so −1 is also a generator
of $\{\mathbf{Z}, +\}$.
What about [2]? It consists of 2, 4, 6, 8, ..., and also 0, −2, −4, −6,
−8, .... More precisely, we know that

$$[2] = \{n \cdot 2 \mid n \in \mathbf{Z}\} = 2\mathbf{Z}$$

and also

$$[-2] = \{n(-2) \mid n \in \mathbf{Z}\} = (-2)\mathbf{Z} = 2\mathbf{Z}.$$

More generally, for any $m \in \mathbf{Z}$,

$$[m] = \{n \cdot m \mid n \in \mathbf{Z}\} = m\mathbf{Z}$$

and also

$$[-m] = [m].$$

Because $m\mathbf{Z} = \mathbf{Z} \iff m = \pm 1$, this tells us, in particular,
that +1 and −1 are the only generators of $\{\mathbf{Z}, +\}$. In addition, this
result tells us that every subgroup of the cyclic group $\{\mathbf{Z}, +\}$ is
cyclic; in fact, as seen in 4-2-4, any subgroup is of form $m\mathbf{Z}$ with
$m \ge 0$, and from above, $m\mathbf{Z} = [m]$, which is a cyclic group
generated by m.

(2) Consider the additive group of reals $\{\mathbf{R}, +\}$. Let $\alpha$ be
any nonzero element of $\mathbf{R}$. Then the cyclic subgroup generated by
$\alpha$ is

$$[\alpha] = \{n\alpha \mid n \in \mathbf{Z}\}.$$

This cannot equal $\mathbf{R}$; for example, there is no integer n for which
$n\alpha = \tfrac{1}{2}\alpha$, so the real number $\tfrac{1}{2}\alpha$ does not
belong to $[\alpha]$. Therefore, $\{\mathbf{R}, +\}$ is an example of an
infinite group which is not cyclic.

(3) Consider the additive group $\{\mathbf{Z}_6, +\}$, whose elements are
0, 1, 2, 3, 4, 5. What is [1]? By repeated addition of 1 to itself, we see that
1, 2, 3, 4, 5, 6 = 0, ... all belong to [1]. Continuing the process of adding 1
beyond 6 = 0, we have 1, 2, 3, 4, 5, 6 = 0, 1, 2, ..., so the elements keep
repeating in a “cyclical” fashion, and we have no new elements of [1].
Similarly, taking inverses gives −5 = 1, −4 = 2, −3 = 3, −2 = 4, −1 = 5, −0 = 0
as elements of [1]. In short, $[1] = \mathbf{Z}_6$; so $\{\mathbf{Z}_6, +\}$ is
a cyclic group with generator 1.
Computing (in the same way) the cyclic subgroups of $\{\mathbf{Z}_6, +\}$
generated by each of the remaining elements, we have

$$\begin{aligned}
[0] &= (0), & [1] &= \mathbf{Z}_6, & [2] &= \{0, 2, 4\}, \\
[3] &= \{0, 3\}, & [4] &= \{0, 2, 4\}, & [5] &= \mathbf{Z}_6.
\end{aligned}$$

In particular, 5 (which equals −1) is also a generator of $\mathbf{Z}_6$.

Next, let us consider the additive group $\{\mathbf{Z}_7, +\}$, whose elements are
$0, 1, 2, 3, 4, 5, 6$. Since $[1]$ contains $1, 2, 3, 4, 5, 6, 7 = 0, \ldots$, we have
$[1] = \mathbf{Z}_7$, so $\{\mathbf{Z}_7, +\}$ is a cyclic group, and 1 is a generator. Similarly,
$[2]$ contains $2, 4, 6, 8 = 1, 3, 5, 7 = 0, \ldots$, which implies $[2] = \mathbf{Z}_7$. Even more,
as is easily checked,

$$\mathbf{Z}_7 = [1] = [2] = [3] = [4] = [5] = [6],$$

so every nonzero element of $\mathbf{Z}_7$ is a generator of the cyclic group.

Pursuing this type of example to its logical conclusion: For any $m > 1$,
the additive group $\{\mathbf{Z}_m, +\}$ (whose elements may be taken as $0, 1, 2, \ldots, m - 1$)
is cyclic since $[1] = \mathbf{Z}_m$. In particular, for any $m > 1$ there exists
a cyclic group of order $m$, namely $\{\mathbf{Z}_m, +\}$. The question of
which elements of $\mathbf{Z}_m$ are generators will be settled later in this section; for
the moment it is left as a challenge to the reader.
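
The computations of example (3) are easy to mechanize. The following is a minimal sketch in Python, offered only as an aid to experimentation; the function name `cyclic_subgroup_additive` is our own, and $\mathbf{Z}_m$ is assumed to be represented by the integers $0, 1, \ldots, m - 1$.

```python
def cyclic_subgroup_additive(a, m):
    """Return [a] in the additive group Z_m as a sorted list.

    Starting from 0, we keep adding a (mod m) until 0 recurs.
    """
    elems = []
    x = 0
    while True:
        elems.append(x)
        x = (x + a) % m
        if x == 0:
            break
    return sorted(elems)


# List [a] for every a in Z_6 and in Z_7.
for m in (6, 7):
    for a in range(m):
        print(f"Z_{m}: [{a}] = {cyclic_subgroup_additive(a, m)}")
```

Running the loop for other values of $m$ gives some evidence about which elements of $\mathbf{Z}_m$ are generators, without settling the question.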

(4) Consider the group of rigid motions of the equilateral triangle,
$S_3 = \{e, \sigma_1, \sigma_2, \tau_1, \tau_2, \tau_3\}$, which was introduced in 4-1-9. This is a finite
group which is not cyclic. To see this, it suffices to check that for each $\sigma \in S_3$
we have $[\sigma] \ne S_3$.

Clearly, $[e] = (e)$. As for $[\sigma_1]$: it surely contains

$$\sigma_1, \quad \sigma_1^2 = \sigma_2, \quad \sigma_1^3 = \sigma_1\sigma_2 = e, \quad \sigma_1^4 = \sigma_1, \quad \sigma_1^5 = \sigma_2, \ldots$$

and also

$$\sigma_1^0 = e, \quad \sigma_1^{-1} = \sigma_2, \quad \sigma_1^{-2} = \sigma_2^2 = \sigma_1, \quad \sigma_1^{-3} = e, \quad \sigma_1^{-4} = \sigma_2, \ldots$$

Thus, because of the way the terms repeat, we have

$$[\sigma_1] = \{\sigma_1^n \mid n \in \mathbf{Z}\} = \{e, \sigma_1, \sigma_2\}.$$

In similar fashion:

$$[\sigma_2] = \{e, \sigma_1, \sigma_2\}, \quad [\tau_1] = \{e, \tau_1\},$$

$$[\tau_2] = \{e, \tau_2\}, \quad [\tau_3] = \{e, \tau_3\},$$

so indeed $S_3$ is not cyclic.
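
The same check can be carried out by machine. The sketch below is an illustration only: it assumes the usual identification of the six rigid motions with the permutations of the three vertices (labelled 0, 1, 2 here), and the helper names `compose` and `cyclic_subgroup` are our own.

```python
from itertools import permutations


def compose(p, q):
    """Composite permutation: first apply q, then p (tuples acting on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(q)))


def cyclic_subgroup(p):
    """Return the list of distinct powers of the permutation p."""
    identity = tuple(range(len(p)))
    elems, x = [identity], p
    while x != identity:
        elems.append(x)
        x = compose(p, x)
    return elems


S3 = list(permutations(range(3)))
# Sizes of [sigma] for each of the six elements sigma of S_3.
sizes = sorted(len(cyclic_subgroup(p)) for p in S3)
print(sizes)
print("cyclic:", any(n == len(S3) for n in sizes))
```

No element generates a subgroup with six elements, in agreement with the hand computation.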

(5) Consider $\{\mathbf{Z}_7^*, \cdot\}$; this is a multiplicative group consisting of the six
elements $1, 2, 3, 4, 5, 6$. Let us find

$$[2] = \{2^n \mid n \in \mathbf{Z}\}.$$

We have, in $\mathbf{Z}_7$,

$$2^1 = 2, \quad 2^2 = 4, \quad 2^3 = 1, \quad 2^4 = 2, \quad 2^5 = 4, \quad 2^6 = 1, \ldots$$

and using $2^{-1} = 2^2 = 4$ (which follows from $2^1 \cdot 2^2 = 2^3 = 1$) we have also

$$2^0 = 1, \quad 2^{-1} = 4, \quad 2^{-2} = (2^{-1})^2 = 2, \quad 2^{-3} = (2^{-1})^3 = 1, \quad 2^{-4} = 4, \ldots$$

Consequently,

$$[2] = \{1, 2, 4\}.$$

In the same way, we find

$$[1] = \{1\}, \quad [3] = \{1, 2, 3, 4, 5, 6\} = \mathbf{Z}_7^*, \quad [4] = \{1, 2, 4\},$$

$$[5] = \{1, 2, 3, 4, 5, 6\} = \mathbf{Z}_7^*, \quad [6] = \{1, 6\}.$$

In particular, $\mathbf{Z}_7^*$ is a cyclic group; it has two generators, 3 and 5.

The group $\{\mathbf{Z}_{17}^*, \cdot\}$, which consists of the 16 elements $1, 2, \ldots, 15, 16$,
is also cyclic. The cyclic group generated by 2 is easily seen to be

$$[2] = \{1, 2, 4, 8, 9, 13, 15, 16\},$$

so 2 is not a generator of $\mathbf{Z}_{17}^*$. However, 3 is a generator, since the powers of
3 include

$$\begin{aligned}
3^1 &= 3, & 3^2 &= 9, \\
3^3 &= 3 \cdot 9 = 10, & 3^4 &= 3 \cdot 10 = 13, \\
3^5 &= 3 \cdot 13 = 5, & 3^6 &= 3 \cdot 5 = 15, \\
3^7 &= 3 \cdot 15 = 11, & 3^8 &= 3 \cdot 11 = 16, \\
3^9 &= 3 \cdot 16 = 14, & 3^{10} &= 3 \cdot 14 = 8, \\
3^{11} &= 3 \cdot 8 = 7, & 3^{12} &= 3 \cdot 7 = 4, \\
3^{13} &= 3 \cdot 4 = 12, & 3^{14} &= 3 \cdot 12 = 2, \\
3^{15} &= 3 \cdot 2 = 6, & 3^{16} &= 3 \cdot 6 = 1, \\
3^{17} &= 3, \ldots
\end{aligned}$$

The reader who does not mind doing arithmetic can check that 3, 5, 6, 7, 10,
11, 12, 14 are generators of $\mathbf{Z}_{17}^*$; in other words,

$$\mathbf{Z}_{17}^* = [3] = [5] = [6] = [7] = [10] = [11] = [12] = [14],$$

whereas the remaining elements of $\mathbf{Z}_{17}^*$ are not generators.

More generally, later in this section we shall see that $\{\mathbf{Z}_p^*, \cdot\}$ is cyclic
for any prime $p$, and also obtain some information about the generators of
such a cyclic group.

On the other hand, $\{\mathbf{Z}_m^*, \cdot\}$ is not cyclic for all $m$ [for example,
$\mathbf{Z}_{12}^* = \{1, 5, 7, 11\}$ is not cyclic, since $[1] = \{1\}$, $[5] = \{1, 5\}$, $[7] = \{1, 7\}$,
$[11] = \{1, 11\}$], and $\{\mathbf{Z}_m^*, \cdot\}$ may be cyclic even if $m$ is not prime (for example,
$\mathbf{Z}_{10}^* = \{1, 3, 7, 9\}$ is cyclic with generators 3 and 7). The full story as to when
$\{\mathbf{Z}_m^*, \cdot\}$ is a cyclic group will unfold in Section 4-6.
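
For readers who would rather let a machine do the arithmetic of example (5), here is a brute-force sketch. It is an illustration under stated assumptions: $\mathbf{Z}_m^*$ is taken to be the set of integers in $1, \ldots, m - 1$ prime to $m$ (with $m \ge 2$), and the names `units`, `cyclic_subgroup`, and `generators` are our own.

```python
from math import gcd


def units(m):
    """Elements of Z_m^*: the integers 1..m-1 that are prime to m (m >= 2)."""
    return [a for a in range(1, m) if gcd(a, m) == 1]


def cyclic_subgroup(a, m):
    """[a] in Z_m^*: successive powers of the unit a, until 1 recurs."""
    elems, x = [], 1
    while True:
        x = (x * a) % m
        elems.append(x)
        if x == 1:
            break
    return sorted(elems)


def generators(m):
    """All a in Z_m^* with [a] = Z_m^* (an empty list means: not cyclic)."""
    u = units(m)
    return [a for a in u if cyclic_subgroup(a, m) == u]


for m in (7, 17, 12, 10):
    print(m, generators(m))
```

An empty list of generators, as for $m = 12$, is exactly the statement that the group is not cyclic.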

(6) Consider the group of $n$th roots of unity, $W_n$, where $n$ is any positive
integer (see 4-2-8). This is a cyclic group of order $n$; in fact, as seen in 4-2-8,
the element $\zeta = \cos\alpha + i \sin\alpha$, where $\alpha = 360^\circ/n$, is a generator.

(7) Any group $G$ of prime order $p$ is cyclic. In fact, suppose $a \in G$ is any
element different from the identity. Then $[a]$ is a subgroup of $G$, so (by
Lagrange) its order divides $p$. Since $[a]$ contains more than one element (both
$a$ and the identity belong to $[a]$), its order must be $p$; therefore, $[a] = G$ and
$G$ is cyclic. Our proof also shows that for a group with prime order, every
element, except the identity, is a generator.

(8) Obviously, any cyclic group is abelian. Conversely, it is almost
equally obvious that an abelian group need not be cyclic; for example,
$\{\mathbf{Z}_{12}^*, \cdot\}$ is abelian but not cyclic.

4-3-4. Remark. Cyclic groups are characterized by the fact that they can 
be generated by a single element. We digress here to place a group, or a 
subgroup, of form [a] in a more general context. 

Let $S$ be any nonempty subset of the group $G$; in particular, $S$ may consist
of the single element $a$. Consider all expressions obtained by writing a finite
product whose individual terms are either elements of $S$ or inverses of elements
of $S$. Typical expressions of this type look as follows (with $a, b, c, d \in S$):

$$a, \quad abcd, \quad ba^{-1}, \quad c^{-1}, \quad dabb^{-1}cda^{-1},$$

$$bbbaac^{-1}ccd^{-1}a^{-1}a^{-1}a^{-1}bac^{-1}, \quad baa^{-1}b^{-1}, \quad aaa, \quad c^{-1}, \quad c^{-1}.$$

The definition is worded in such a way that powers (other than 1 and $-1$)
are not permitted, but we can certainly combine like terms and use powers
without affecting anything.

Let $[S]$ denote the set of all elements of $G$ which are represented by
expressions (or “words”) of the given type. Of course, the expression for
an element of $[S]$ is not unique; for example, $aa^{-1}$ and $bb^{-1}$ both represent
the element $e$ (so, in particular, $e \in [S]$). Clearly, $[S]$ is closed under
multiplication and taking of inverses; so $[S]$ is a subgroup of $G$, and is known as the
subgroup of $G$ generated by $S$. Note that if $S = \{a\}$ (that is, $S$ consists of
the single element $a$), then, according to the definition, $[S] = \{a^n \mid n \in \mathbf{Z}\}$, so
$[S]$ equals what we have earlier denoted by $[a]$, and the notations for $[a]$ and
$[S]$ are compatible. (Strictly speaking, one should write $[\{a\}]$ instead of $[a]$.)

Among the properties of $[\ ]$, which the reader can easily verify, we have:
$S \subseteq [S]$; $S \subseteq T \Rightarrow [S] \subseteq [T]$; if $H$ is a subgroup which contains $S$, then
$H \supseteq [S]$; $[S]$ is the intersection of all subgroups of $G$ which contain $S$; $[S]$ is
the unique smallest subgroup of $G$ which contains $S$; $S$ is a subgroup $\iff [S] = S$.
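
To make the notion of $[S]$ concrete, the following sketch computes the subgroup generated by a set of units in the finite group $\mathbf{Z}_m^*$. It is an illustration only, and the name `generated_subgroup` is our own. Because the group is finite, the inverse of each generator is itself a positive power of that generator, so the sketch forms products of generators alone and never computes an inverse explicitly.

```python
def generated_subgroup(gens, m):
    """Subgroup [S] of Z_m^* generated by the list gens.

    Assumes every element of gens is prime to m. Starting from 1, we
    repeatedly multiply by generators until no new element appears.
    """
    subgroup = {1}
    frontier = [1]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = (x * g) % m
            if y not in subgroup:
                subgroup.add(y)
                frontier.append(y)
    return sorted(subgroup)


print(generated_subgroup([3], 13))      # a single generator: this is [3]
print(generated_subgroup([3, 12], 13))  # two generators
print(generated_subgroup([5, 7], 12))
```

With a single generator the function returns the cyclic subgroup $[a]$, in keeping with the compatibility of notation noted above.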

As seen in the examples considered in 4-3-3, one way to find $[a]$ is to start
taking powers of $a$. If and when we arrive at $a^m = e$, then, apparently, all
the elements of $[a] = \{a^n \mid n \in \mathbf{Z}\}$ are already in hand. In particular, the set
$\{a^n \mid n \in \mathbf{Z}\}$ need not be infinite, and powers of $a$ with distinct exponents
may be equal. Let us analyze the situation more carefully.

4-3-5. Definition. Let $a$ be an element of the multiplicative group $G$.
By the order of $a$ we mean the smallest positive integer $m$ for which $a^m = e$,
and we then write $\operatorname{ord} a = m$. (In particular, if $a = e$, then $m = 1$, so
$\operatorname{ord} e = 1$.) If no such $m$ exists, then $a$ is said to be of infinite order, and we write
$\operatorname{ord} a = \infty$.

If the group $G$ is additive, then our definition becomes: $\operatorname{ord} a$ is the smallest
positive integer $m$ for which $ma = 0$.

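Let us illustrate this notion. In the group $\{\mathbf{Z}_{13}^*, \cdot\}$ we have: $\operatorname{ord} 2 = 12$,
since the positive powers of 2 are $2^1 = 2$, $2^2 = 4$, $2^3 = 8$, $2^4 = 3$, $2^5 = 2 \cdot 3 = 6$,
$2^6 = 2 \cdot 6 = 12$, $2^7 = 2 \cdot 12 = 11$, $2^8 = 2 \cdot 11 = 9$, $2^9 = 2 \cdot 9 = 5$, $2^{10} = 2 \cdot 5 = 10$,
$2^{11} = 2 \cdot 10 = 7$, $2^{12} = 2 \cdot 7 = 1$; $\operatorname{ord} 3 = 3$, since $3^1 = 3$, $3^2 = 9$, $3^3 = 1$;
$\operatorname{ord} 4 = 6$, since $4^1 = 4$, $4^2 = 3$, $4^3 = 4 \cdot 3 = 12$, $4^4 = 4 \cdot 12 = 9$, $4^5 = 4 \cdot 9 = 10$,
$4^6 = 4 \cdot 10 = 1$; $\operatorname{ord} 5 = 4$, since $5^1 = 5$, $5^2 = 12$, $5^3 = 5 \cdot 12 = 8$, $5^4 = 5 \cdot 8 = 1$.

In the additive group $\{\mathbf{Z}_{24}, +\}$ we have: $\operatorname{ord} 2 = 12$, since the positive
multiples of 2 are $1 \cdot 2 = 2$, $2 \cdot 2 = 4$, $3 \cdot 2 = 6$, $4 \cdot 2 = 8$, $5 \cdot 2 = 10$, $6 \cdot 2 = 12$,
$7 \cdot 2 = 14$, $8 \cdot 2 = 16$, $9 \cdot 2 = 18$, $10 \cdot 2 = 20$, $11 \cdot 2 = 22$, $12 \cdot 2 = 0$;
$\operatorname{ord} 3 = 8$, since $1 \cdot 3 = 3$, $2 \cdot 3 = 6$, $3 \cdot 3 = 9$, $4 \cdot 3 = 12$, $5 \cdot 3 = 15$, $6 \cdot 3 = 18$,
$7 \cdot 3 = 21$, $8 \cdot 3 = 0$; $\operatorname{ord} 4 = 6$, since $1 \cdot 4 = 4$, $2 \cdot 4 = 8$, $3 \cdot 4 = 12$, $4 \cdot 4 = 16$,
$5 \cdot 4 = 20$, $6 \cdot 4 = 0$; $\operatorname{ord} 5 = 24$, since $1 \cdot 5 = 5$, $2 \cdot 5 = 10$, $3 \cdot 5 = 15$,
$4 \cdot 5 = 20$, $5 \cdot 5 = 4 \cdot 5 + 5 = 20 + 5 = 1$, $6 \cdot 5 = 5 \cdot 5 + 5 = 6$, $7 \cdot 5 = 6 + 5 = 11$,
$8 \cdot 5 = 16$, $9 \cdot 5 = 21$, $10 \cdot 5 = 2$, $11 \cdot 5 = 7$, $12 \cdot 5 = 12$, $13 \cdot 5 = 17$,
$14 \cdot 5 = 22$, $15 \cdot 5 = 3$, $16 \cdot 5 = 8$, $17 \cdot 5 = 13$, $18 \cdot 5 = 18$, $19 \cdot 5 = 23$,
$20 \cdot 5 = 4$, $21 \cdot 5 = 9$, $22 \cdot 5 = 14$, $23 \cdot 5 = 19$, $24 \cdot 5 = 0$.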
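
The definition of order translates directly into a short loop. The sketch below is ours and is meant only as an illustration; `mult_order` and `add_order` are hypothetical names, and the groups are assumed to be $\mathbf{Z}_m^*$ and $\mathbf{Z}_m$ with elements represented by ordinary integers.

```python
from math import gcd


def mult_order(a, m):
    """Smallest k >= 1 with a**k = 1 in Z_m^*; a must be prime to m."""
    if gcd(a, m) != 1:
        raise ValueError("a is not an element of Z_m^*")
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k


def add_order(a, m):
    """Smallest k >= 1 with k*a = 0 in the additive group Z_m."""
    k, x = 1, a % m
    while x != 0:
        x = (x + a) % m
        k += 1
    return k


print([mult_order(a, 13) for a in (2, 3, 4, 5)])
print([add_order(a, 24) for a in (2, 3, 4, 5)])
```

Both loops terminate because the groups are finite; for an element of infinite order no such loop could stop, which is one way to read the definition of $\operatorname{ord} a = \infty$.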


4-3-6. Proposition. For an element $a$ of the multiplicative group $G$, we have

$$\operatorname{ord} a = \infty \iff \text{all powers } a^n,\ n \in \mathbf{Z}, \text{ are distinct}$$

and, when this is the case, $\operatorname{ord} a = \#[a]$.

Proof: To prove $\Rightarrow$, suppose $\operatorname{ord} a = \infty$. If the powers $a^n$ are not all distinct,
then we have $a^i = a^j$ for some $i \ne j$, say $i < j$. But then $a^{j-i} = e$ with
$j - i > 0$, which contradicts the hypothesis that $\operatorname{ord} a = \infty$.

Conversely, if all powers $a^n$ are distinct then, in particular, $a^0 = e$ is not
equal to $a^m$ for any $m > 0$, so $\operatorname{ord} a = \infty$. This proves $\Leftarrow$.

When the hypotheses hold, the group $[a] = \{a^n \mid n \in \mathbf{Z}\}$ has an infinite
number of elements; that is, its order $\#[a]$ is $\infty$, which is equal to $\operatorname{ord} a$. ∎

4-3-7. Proposition. For an element $a$ of the multiplicative group $G$, we have

$$\operatorname{ord} a < \infty \iff \text{not all } a^n,\ n \in \mathbf{Z}, \text{ are distinct.}$$

If this is the case, and $\operatorname{ord} a = m$, then

(i) $a^n = e \iff m \mid n$,

(ii) $a^i = a^j \iff i \equiv j \pmod{m}$,

(iii) the elements $e = a^0, a^1, a^2, \ldots, a^{m-1}$ are distinct,

(iv) $[a] = \{e, a, a^2, \ldots, a^{m-1}\}$,

(v) $\operatorname{ord} a = \#[a]$.

Proof: The first assertion is equivalent to the assertion of 4-3-6. In fact,
taking “negatives” in 4-3-6, we have: It is false that $\operatorname{ord} a = \infty$ $\iff$ it is
false that all $a^n$ are distinct. This says: $\operatorname{ord} a < \infty \iff$ not all $a^n$ are distinct.
Now, suppose $\operatorname{ord} a = m$; then

(i) If $m \mid n$, say $n = md$, then $a^n = (a^m)^d = e^d = e$, which proves $\Leftarrow$.
Conversely, if $a^n = e$, then, by the division algorithm, we may write

$$n = qm + r, \quad 0 \le r < m.$$

Therefore,

$$e = a^n = a^{qm+r} = a^{qm}a^r = (a^m)^q a^r = a^r.$$

Since $0 \le r < m$, and $m$ is the smallest positive integer for which $a^m = e$, it
follows that $r = 0$; thus, $n = qm$, $m \mid n$, and the implication $\Rightarrow$ is proved.

(ii) Using part (i), we have

$$a^i = a^j \iff a^{j-i} = e \iff m \mid (j - i) \iff i \equiv j \pmod{m}.$$

(iii) In virtue of (ii) the elements $e = a^0, a^1, a^2, \ldots, a^{m-1}$ are distinct
because no two of their exponents $0, 1, \ldots, m - 1$ are congruent (mod $m$).

(iv) Obviously, $\{e = a^0, a^1, a^2, \ldots, a^{m-1}\} \subseteq \{a^n \mid n \in \mathbf{Z}\} = [a]$. On the
other hand, given $a^n$ there exists an integer $i$, satisfying $0 \le i \le m - 1$, for
which $n \equiv i \pmod{m}$ and hence $a^n = a^i$. Thus, the inclusion $\supseteq$ holds, and
we have

$$\{e, a, a^2, \ldots, a^{m-1}\} = [a].$$

(v) Now the group $[a]$ has $m$ elements, so

$$\#[a] = m = \operatorname{ord} a.$$

The proof is now complete. ∎

The word “order” has appeared in the present context in two ways,
namely, as the order, $\#(G)$, of a group $G$ (see 4-2-16) and also as the order,
$\operatorname{ord} a$, of an arbitrary element of a group (see 4-3-5). A minor consequence
of the two preceding results is that the two uses of the word “order” are
compatible; namely, for an arbitrary element $a$ of a group $G$, its order is
equal to the order of the cyclic group $[a]$ which it generates.

4-3-8. Remark. Some of the information contained in the two preceding
results, 4-3-6 and 4-3-7, may be restated as follows:

Suppose the group $G$ is cyclic; then

(i) If $\#(G) = \infty$, then $G$ is of form $G = [a] = \{a^n \mid n \in \mathbf{Z}\}$, where all the
powers of $a$ are distinct.

(ii) If $\#(G) = m$, then $G$ is of form $G = [a] = \{e, a, a^2, \ldots, a^{m-1}\}$, where
$e, a, \ldots, a^{m-1}$ are distinct.

An interesting result, related to 4-3-7, goes as follows. Suppose $G$ is an
arbitrary finite group, of order $n$, say, and consider any $a \in G$. Since $[a]$ is
a subgroup of $G$, Lagrange says that the order of $[a]$ divides the order of
$G$: $\#[a] \mid \#(G)$. But $\operatorname{ord} a = \#[a]$; so $\operatorname{ord} a$ (call it $m$) divides $\#(G) = n$. In
particular, every element of a finite group has finite order dividing the order
of the group. Furthermore, since $a^m = e$ and $m \mid n$, we conclude that $a^n = e$.
In words, raising any element of the finite group $G$ to the power $\#(G)$ gives
the identity. We have proved:

4-3-9. Proposition. Suppose $G$ is a finite group; then for any $a \in G$,

(i) $(\operatorname{ord} a) \mid \#(G)$,

(ii) $a^{\#(G)} = e$.


This seemingly innocuous result is of especial importance for us because 
it enables us to understand the real meaning of the theorems of Fermat and 
Euler [see 2-4-9 and 3-2-22, part (7)]. In more detail, we have: 


4-3-10. Corollary (Euler-Fermat). For any $m > 1$, consider the multiplicative
group $\{\mathbf{Z}_m^*, \cdot\}$ whose order is $\phi(m)$. Then for any $a \in \mathbf{Z}_m^*$,

$$a^{\phi(m)} = 1 \quad (\text{in } \mathbf{Z}_m^*).$$

In particular, if $m$ is prime, $m = p$, then

$$a^{p-1} = 1 \quad (\text{in } \mathbf{Z}_p^*).$$

Furthermore, when these facts are translated into the language of congruences,
they take the form:

If $a$ is an integer with $(a, m) = 1$, then

$$a^{\phi(m)} \equiv 1 \pmod{m}$$

and, in particular, if $m = p$, then we have:

$$a^{p-1} \equiv 1 \pmod{p}, \quad p \nmid a.$$
As illustrations of this result we have statements such as:

(1) Because $\phi(12) = 4$, every element $a$ of $\mathbf{Z}_{12}^*$ satisfies $a^4 = 1$; that is,
in $\mathbf{Z}_{12}^*$, $1^4 = 1$, $5^4 = 1$, $7^4 = 1$, $11^4 = 1$. Expressed in terms of congruences
(mod 12) this says: For any integer $a$ prime to 12, we have $a^4 \equiv 1 \pmod{12}$;
in particular, $1^4 \equiv 1$, $5^4 \equiv 1$, $7^4 \equiv 1$, $11^4 \equiv 1 \pmod{12}$.

(2) Because $\phi(30) = 8$, every element $a$ of $\mathbf{Z}_{30}^*$ satisfies $a^8 = 1$; or in
terms of congruences, if $a$ is an integer prime to 30, then $a^8 \equiv 1 \pmod{30}$.

(3) Since 29 is prime, $\phi(29) = 28$, and every element $a$ of $\mathbf{Z}_{29}^*$ satisfies
$a^{28} = 1$; or, for any integer $a \ne 0$ with $29 \nmid a$, we have $a^{28} \equiv 1 \pmod{29}$.
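
These illustrations can be confirmed numerically. The sketch below is an illustration, not a proof; it computes $\phi(m)$ by direct counting, and the names `phi` and `euler_fermat_holds` are our own. Python's built-in three-argument `pow` performs the modular exponentiation.

```python
from math import gcd


def phi(m):
    """Euler phi-function by direct count of the integers 1..m prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def euler_fermat_holds(m):
    """Check a**phi(m) = 1 (mod m) for every a in 1..m-1 prime to m."""
    k = phi(m)
    return all(pow(a, k, m) == 1 for a in range(1, m) if gcd(a, m) == 1)


for m in (12, 30, 29):
    print(m, phi(m), euler_fermat_holds(m))
```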

From their very nature, it is clear that cyclic groups are the simplest type 
possible. We know more about them than about any other groups. The 
complete nature of our information is made manifest when we study their 
subgroups—for, as our next result shows, we can answer such questions as: 
what does any subgroup look like? is it cyclic? for which orders does a 
subgroup exist? how many subgroups are there of a given order? 


4-3-11. Theorem. If $G$ is a cyclic group, then every subgroup $H$ is cyclic;
in fact, if $G = [a]$ and $H \ne (e)$, then $H = [a^n]$, where $n$ is the smallest
positive integer for which $a^n \in H$. Moreover,

(1) If $\#(G) = \infty$, then every subgroup [except $(e)$] is infinite cyclic. For
positive $n$, the subgroups $[a^n]$ are distinct, and there are no other
subgroups [except $(e)$].

(2) If $\#(G) = m$, then

(i) $n \mid m$.

(ii) The order of any subgroup is a divisor of $m$, and for every divisor
$d$ of $m$ there exists a unique subgroup of $G$ of order $d$, namely
$[a^{m/d}]$.

(iii) The set of all subgroups of $G$ is

$$\{[a^d] \mid d \mid m\} = \{[a^{m/d}] \mid d \mid m\}.$$

Proof: If $H = (e)$, then $H$ is cyclic with generator $e$; that is, $H = [e]$.
Suppose, therefore, $H \ne (e)$. Because $G = [a]$, there exists an element $a^s \in H$
with $a^s \ne e$. We may assume $s$ is positive, for if it is not, then $-s$ is positive
with $a^{-s} = (a^s)^{-1} \in H$ and $a^{-s} \ne e$. In particular, there exists a positive
integer $s$ for which $a^s \in H$. Let $n$ be the smallest positive integer satisfying
this condition. Since $a^n \in H$, we have

$$[a^n] \subseteq H.$$

To prove the reverse inclusion consider any $a^t \in H$. According to the division
algorithm, we may write (even if $n = 1$)

$$t = qn + r, \quad 0 \le r < n$$

and then

$$a^t = a^{qn}a^r = (a^n)^q a^r.$$

Because $a^t$ and $(a^n)^q$ belong to the subgroup $H$, so does $a^r$. By the choice of
$n$, it follows that $r = 0$; so $t = qn$ and $a^t = (a^n)^q \in [a^n]$. Thus, $H \subseteq [a^n]$ and we
have proved

$$[a^n] = H.$$

Now, let us turn to the remaining parts.

(1) Suppose $\#(G) = \infty$. Then an arbitrary subgroup $H \ne (e)$ is of form
$H = [a^n]$ with $n \ge 1$. Since $\operatorname{ord} a = \infty$, it follows that $\operatorname{ord} a^n = \infty$ (for if
$\operatorname{ord} a^n = t$, then $a^{nt} = e$, contradicting $\operatorname{ord} a = \infty$) and (by 4-3-6) $H = [a^n]$ is
infinite cyclic.

We now know that for every $n \ge 1$ we have a subgroup $[a^n]$, and [except
for $(e)$] there are no other subgroups. To show they are distinct, suppose

$$[a^n] = [a^{n'}], \quad n \ge 1, \quad n' \ge 1.$$

By listing the elements of these two subgroups it is easy to see that $n = n'$,
but let us give a more formal proof. Since $a^{n'} \in [a^n]$, we may write
$a^{n'} = (a^n)^t = a^{nt}$ for some $t \in \mathbf{Z}$, $t \ne 0$. Consequently, making use of 4-3-6,
$n' = nt$. In particular,

$$n \le n'.$$

Arguing by symmetry, starting from $a^n \in [a^{n'}]$, we obtain $n' \le n$. Therefore,

$$n = n'$$

and the subgroups $[a^n]$ are indeed distinct. This completes the proof of (1).

(2) Suppose $\#(G) = m$. Then an arbitrary subgroup $H \ne (e)$ is of form
$H = [a^n]$, where $n$ is the smallest positive integer for which $a^n \in H$. As a
matter of fact, in the current situation this statement also holds when
$H = (e)$; for then $a^m = e$ is the smallest positive power of $a$ which belongs to $H$,
and $H = [a^m]$.

Consider any subgroup $H$; it is of form $[a^n]$. Since

$$a^m = e \in H = [a^n]$$

it follows (by the very same argument used earlier to prove $H = [a^n]$) that
$m = qn$ for some $q \in \mathbf{Z}$. Hence, $n \mid m$, which proves (i).
We turn to (ii). By Lagrange, the order of any subgroup of $G$ is a divisor
$d$ of $m = \#(G)$. Conversely, consider any positive divisor $d$ of $m$; the cases
$d = 1$ and $d = m$ are included. Put $m' = m/d$. Let us look at the powers
$(a^{m'})^i$ for $i = 0, 1, \ldots, d - 1$. Since $(a^{m'})^i = a^{im'}$, they are

$$(a^{m'})^0 = a^0 = e, \quad a^{m'}, \quad a^{2m'}, \quad \ldots, \quad a^{(d-1)m'}. \qquad (*)$$

The exponents are all distinct and less than $m$; so no two exponents are
congruent (mod $m$) and, in virtue of 4-3-7, the terms of (*) are distinct. In
particular, in (*), no term after the first one is equal to $e$. Since
$(a^{m'})^d = a^{dm'} = a^m = e$, we have $\operatorname{ord}(a^{m'}) = d$. Consequently, using 4-3-7,

$$\#[a^{m'}] = \operatorname{ord}(a^{m'}) = d,$$

so $[a^{m'}]$ is a subgroup of order $d = m/m'$.

It remains to prove uniqueness. Suppose $H$ is any subgroup of order
$d$, $\#(H) = d$. Then $H$ is of form $H = [a^n]$ where $n$ is the smallest positive
integer for which $a^n \in H$ and, by part (i), $n \mid m$. The argument used above
leads to the conclusion:

$$\#[a^n] = \operatorname{ord}(a^n) = m/n.$$

Consequently, $d = m/n$ and $n = m/d$. We have shown: If $H$ is a subgroup of
order $d$, then $H = [a^{m/d}]$, which depends only on $d$. Therefore, there exists
a unique subgroup of order $d$, and the proof of (ii) is complete.

Clearly, $m/d$ runs over all divisors of $m$ as $d$ runs over all divisors of $m$,
and consequently

$$\{[a^d] \mid d \mid m\} = \{[a^{m/d}] \mid d \mid m\}$$

is the set of all subgroups of $G$. This proves (iii). ∎

4-3-12. Examples. We give some concrete illustrations of the preceding 
theorem. 

(1) Consider the additive group $\{\mathbf{Z}, +\}$. It is infinite cyclic with $\mathbf{Z} = [1]$
(see 4-3-3) and $\operatorname{ord} 1 = \#(\mathbf{Z}) = \infty$. Keeping in mind that, because the
operation is addition, powers $a^n$ are replaced by multiples $na$, 4-3-11, part
(1) says: Every subgroup is cyclic; in fact, for every $n \ge 1$ the subgroup
$[n \cdot 1] = n\mathbf{Z}$ is infinite cyclic, these subgroups are distinct, and there are no
other subgroups [except $(0)$].

Of course, this information is nothing new; it was proved earlier in 4-2-4.

(2) Consider the multiplicative group $\{\mathbf{Z}_{13}^*, \cdot\}$. Its underlying set consists
of the 12 elements $1, 2, \ldots, 11, 12$. It is cyclic; in fact, because the powers of
2 are $2, 4, 8, 3, 6, 12, 11, 9, 5, 10, 7, 1$, the element 2 is a generator. Thus,

$$\mathbf{Z}_{13}^* = [2], \quad \operatorname{ord} 2 = \#(\mathbf{Z}_{13}^*) = 12.$$

According to 4-3-11, part (2), because the divisors of 12 are 1, 2, 3, 4, 6, 12,
all the subgroups of $\mathbf{Z}_{13}^*$ are as follows:

$$[2^1], \quad [2^2], \quad [2^3], \quad [2^4], \quad [2^6], \quad [2^{12}]$$

and their orders are, respectively,

$$12, \quad 6, \quad 4, \quad 3, \quad 2, \quad 1.$$

In more detail, the subgroups are:

$$\begin{aligned}
[2^1] &= \mathbf{Z}_{13}^*, \\
[2^2] &= [4] = \{4, 3, 12, 9, 10, 1\} = \{1, 3, 4, 9, 10, 12\}, \\
[2^3] &= [8] = \{8, 12, 5, 1\} = \{1, 5, 8, 12\}, \\
[2^4] &= [3] = \{3, 9, 1\} = \{1, 3, 9\}, \\
[2^6] &= [12] = \{12, 1\} = \{1, 12\}, \\
[2^{12}] &= [1] = \{1\}.
\end{aligned}$$

Incidentally, we have proved the assertions made in 4-2-5 about the subgroups
of $\{\mathbf{Z}_{13}^*, \cdot\}$.

(3) Consider the additive group $\{\mathbf{Z}_{12}, +\}$. As observed in 4-3-3, part
(3), this is a cyclic group:

$$\mathbf{Z}_{12} = [1], \quad \operatorname{ord} 1 = \#(\mathbf{Z}_{12}) = 12.$$

Being a cyclic group of order 12, this group behaves entirely like the cyclic
group of order 12, $\{\mathbf{Z}_{13}^*, \cdot\}$. In more detail, the set of all subgroups is

$$\{[d \cdot 1] \mid d \mid 12\};$$

that is: For $d = 1, 2, 3, 4, 6, 12$ we have the subgroups

$$\begin{aligned}
[1] &= \mathbf{Z}_{12}, \\
[2] &= \{2, 4, 6, 8, 10, 0\}, \\
[3] &= \{3, 6, 9, 0\}, \\
[4] &= \{4, 8, 0\}, \\
[6] &= \{6, 0\}, \\
[12] &= [0] = (0).
\end{aligned}$$

We have proved the assertions about the subgroups of $\{\mathbf{Z}_{12}, +\}$ that were
made in 4-2-5.
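
The recipe of 4-3-11, part (2), lends itself to direct computation. The following sketch is an illustration with names of our own choosing; it assumes a unit $a$ of $\mathbf{Z}_m^*$ is given, finds $n = \operatorname{ord} a$ by brute force, and then lists, for each divisor $d$ of $n$, the subgroup $[a^{n/d}]$ of order $d$.

```python
def subgroups_from_generator(a, m):
    """Subgroups of [a] in Z_m^*, keyed by their order d.

    Assumes a is prime to m. With n = ord a, the subgroup of order d
    (for each divisor d of n) is generated by a**(n // d).
    """
    # Find n = ord a by repeated multiplication.
    n, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        n += 1
    result = {}
    for d in range(1, n + 1):
        if n % d == 0:
            g = pow(a, n // d, m)  # generator of the subgroup of order d
            result[d] = sorted({pow(g, i, m) for i in range(d)})
    return result


for order, elems in subgroups_from_generator(2, 13).items():
    print(order, elems)
```

For the generator 2 of $\mathbf{Z}_{13}^*$ this yields one subgroup for each of the divisors 1, 2, 3, 4, 6, 12, as in example (2).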

What about the order of any element $a^r$ of the finite cyclic group $G = [a]$?
Which subgroup does $a^r$ generate? When is $a^r$ a generator of $G$? These
questions are now easy to settle.
4-3-13. Proposition. Suppose $G = [a]$ is a cyclic group of order $m$ or,
more generally, suppose $a$ is an element of an arbitrary multiplicative
group $G$ and

$$\operatorname{ord} a = \#[a] = m.$$

Then for any integer $r$,

(i) $\operatorname{ord}(a^r) = m/(r, m)$,

(ii) $[a^r] = [a^{(r,m)}]$,

(iii) $a^r$ generates $[a]$ $\iff$ $r$ is relatively prime to $m$,

(iv) the group $[a]$ has $\phi(m)$ generators.

Proof: We put $\operatorname{ord}(a^r) = t$ and $d = (r, m)$. Then we may write $r = dr'$,
$m = dm'$, and $(r', m') = 1$. Note that $m/(r, m) = m'$.

Since

$$(a^r)^{m'} = a^{rm'} = a^{r'm} = (a^m)^{r'} = e$$

it follows that $t = \operatorname{ord}(a^r)$ divides $m'$. On the other hand,

$$\begin{aligned}
t = \operatorname{ord}(a^r) &\Rightarrow (a^r)^t = e \Rightarrow a^{rt} = e \\
&\Rightarrow m \mid rt \Rightarrow m' \mid r't \\
&\Rightarrow m' \mid t \quad [\text{since } (m', r') = 1].
\end{aligned}$$

Thus $t = m'$, and (i) is proved.

(ii) In virtue of (i), $[a^r]$ is the subgroup of $[a]$ of order $m/(r, m)$, and
according to 4-3-11, part 2(ii), the unique subgroup of order $m/(r, m)$ is
$[a^{(r,m)}]$. Hence

$$[a^r] = [a^{(r,m)}].$$

(iii) In virtue of (ii),

$$\begin{aligned}
[a^r] = [a] &\iff [a^{(r,m)}] = [a] \\
&\iff \operatorname{ord}[a^{(r,m)}] = \operatorname{ord}[a] \quad (\text{using 4-3-11}) \\
&\iff m/(r, m) = m \\
&\iff (r, m) = 1.
\end{aligned}$$

(iv) If $r \equiv r' \pmod{m}$, then $a^r = a^{r'}$, so in looking for generators it
suffices to consider only $r = 1, 2, \ldots, m - 1$. Then from (iii), $a^r$ is a generator
$\iff (r, m) = 1$. The number of integers $r = 1, 2, \ldots, m - 1$ which are relatively
prime to $m$ is $\phi(m)$; therefore, the number of generators of $[a]$ is $\phi(m)$. ∎

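4-3-14. Example. In 4-3-3, part (5), we saw that $\{\mathbf{Z}_{17}^*, \cdot\}$ is cyclic and 3
is a generator. Thus

$$\mathbf{Z}_{17}^* = [3] \quad \text{and} \quad \operatorname{ord} 3 = \#(\mathbf{Z}_{17}^*) = 16.$$

According to 4-3-13, with $a = 3$, $m = 16$, we know that $\mathbf{Z}_{17}^*$ has $\phi(16) = 8$
generators. In fact, for $r = 1, 2, \ldots, 15$, $3^r$ is a generator $\iff (r, 16) = 1$.
Therefore, the generators are

$$3^1 = 3, \quad 3^3 = 10, \quad 3^5 = 5, \quad 3^7 = 11, \quad 3^9 = 14, \quad 3^{11} = 7, \quad 3^{13} = 12, \quad 3^{15} = 6$$

and we have

$$\mathbf{Z}_{17}^* = [3] = [5] = [6] = [7] = [10] = [11] = [12] = [14]$$

as was indicated in 4-3-3. In particular, in $\mathbf{Z}_{17}^*$,

$$16 = \operatorname{ord} 3 = \operatorname{ord} 5 = \operatorname{ord} 6 = \operatorname{ord} 7 = \operatorname{ord} 10 = \operatorname{ord} 11 = \operatorname{ord} 12 = \operatorname{ord} 14.$$

As for the orders of the remaining elements, since

$$1 = 3^0, \quad 2 = 3^{14}, \quad 4 = 3^{12}, \quad 8 = 3^{10}, \quad 9 = 3^2, \quad 13 = 3^4, \quad 15 = 3^6, \quad 16 = 3^8,$$

we have

$$\operatorname{ord} 1 = 1, \qquad \operatorname{ord} 2 = \operatorname{ord}(3^{14}) = \frac{16}{(14, 16)} = 8,$$

$$\operatorname{ord} 4 = \operatorname{ord}(3^{12}) = \frac{16}{(12, 16)} = 4, \qquad \operatorname{ord} 8 = \operatorname{ord}(3^{10}) = \frac{16}{(10, 16)} = 8,$$

$$\operatorname{ord} 9 = \operatorname{ord}(3^{2}) = \frac{16}{(2, 16)} = 8, \qquad \operatorname{ord} 13 = \operatorname{ord}(3^{4}) = \frac{16}{(4, 16)} = 4,$$

$$\operatorname{ord} 15 = \operatorname{ord}(3^{6}) = \frac{16}{(6, 16)} = 8, \qquad \operatorname{ord} 16 = \operatorname{ord}(3^{8}) = \frac{16}{(8, 16)} = 2.$$

The “proper” subgroups (meaning the nontrivial ones, that is, those
other than $(1)$ and $\mathbf{Z}_{17}^*$) are

$[2] = [8] = [9] = [15]$, whose order is 8,

$[4] = [13]$, whose order is 4,

$[16]$, whose order is 2.

One may check that these subgroups are

$$[2] = \{1, 2, 4, 8, 9, 13, 15, 16\}, \quad [4] = \{1, 4, 13, 16\}, \quad [16] = \{1, 16\}.$$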
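
The formula $\operatorname{ord}(a^r) = m/(r, m)$ of 4-3-13 can be compared with a brute-force computation of orders. The sketch below does this for the generator 3 of $\mathbf{Z}_{17}^*$; it is an illustration only, and the names `mult_order` and `orders_by_formula` are our own.

```python
from math import gcd


def mult_order(a, m):
    """Order of the unit a in Z_m^*, found by repeated multiplication."""
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k


def orders_by_formula(a, n, m):
    """Map each power a**r (mod m), 0 <= r < n, to n // gcd(r, n).

    Here n is assumed to be ord a, so the values are ord(a**r) by 4-3-13.
    """
    return {pow(a, r, m): n // gcd(r, n) for r in range(n)}


table = orders_by_formula(3, 16, 17)
# Compare the formula with the directly computed order of every element.
agrees = all(mult_order(b, 17) == k for b, k in table.items())
for b in sorted(table):
    print(b, table[b])
print("formula agrees with direct computation:", agrees)
```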

We conclude this section by exhibiting an important class of cyclic groups.
The first step is to prove a basic fact about the Euler $\phi$-function.
4-3-15. Proposition. For every $n \ge 1$,

$$\sum_{d \mid n} \phi(d) = n.$$

Proof: It is understood that only positive divisors $d$ of $n$ are considered.
It may be noted that two proofs of this result were indicated in 3-2-23,
Problem 25; we provide the details of the second proof.

For each $d \mid n$ let $S(d)$ denote the subset of the set $\{1, 2, \ldots, n\}$ consisting
of those integers $r$ whose greatest common divisor with $n$ is $d$; in symbols

$$S(d) = \{r \in \{1, 2, \ldots, n\} \mid (r, n) = d\}.$$

Let $\lambda(d)$ denote the number of elements in the set $S(d)$.

As the reader will observe, every integer $r \in \{1, \ldots, n\}$ belongs to exactly
one of the sets $S(d)$; or, to put it another way, the sets $S(d)$ are disjoint and
their union is $\{1, 2, \ldots, n\}$. In particular, we have

$$n = \sum_{d \mid n} \lambda(d).$$

If $(r, n) = d$, then $r$ must be a multiple of $d$; so $r$ must come from the set

$$\left\{d, 2d, \ldots, md, \ldots, \frac{n}{d}\,d\right\}.$$

But we are interested only in the elements $md$ from this set for which $(md, n) = d$.
Since

$$(md, n) = \left(md, \frac{n}{d}\,d\right) = d\left(m, \frac{n}{d}\right),$$

we have

$$(md, n) = d \iff \left(m, \frac{n}{d}\right) = 1.$$

It follows that

$$(r, n) = d \iff r = md \text{ where } 1 \le m \le \frac{n}{d} \text{ and } \left(m, \frac{n}{d}\right) = 1,$$

from which we conclude

$$\lambda(d) = \phi\!\left(\frac{n}{d}\right).$$

As $d$ runs over all divisors of $n$, so does $n/d$. Hence,

$$n = \sum_{d \mid n} \lambda(d) = \sum_{d \mid n} \phi\!\left(\frac{n}{d}\right) = \sum_{d \mid n} \phi(d).$$

∎
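
The counting argument in this proof can be watched at work for a particular $n$. The sketch below is an illustration, with function names of our own; it builds the sets $S(d)$, checks that each has $\phi(n/d)$ elements, and checks that the values $\phi(d)$ sum to $n$.

```python
from math import gcd


def phi(n):
    """Euler phi-function by direct count of the integers 1..n prime to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def S(d, n):
    """The set S(d): integers r in 1..n whose gcd with n equals d."""
    return [r for r in range(1, n + 1) if gcd(r, n) == d]


def check(n):
    """True if #S(d) = phi(n/d) for every d | n, and sum of phi(d) equals n."""
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    sizes_match = all(len(S(d, n)) == phi(n // d) for d in divisors)
    total = sum(phi(d) for d in divisors)
    return sizes_match and total == n


print({d: S(d, 12) for d in (1, 2, 3, 4, 6, 12)})
print(all(check(n) for n in range(1, 200)))
```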
4-3-16. Theorem. For every prime $p$, $\{\mathbf{Z}_p^*, \cdot\}$, the multiplicative group of
the field $\mathbf{Z}_p$, is cyclic.


It involves little additional effort to prove the following more general 
result: 

4-3-17. Theorem. Let F be any field and F* denote its multiplicative 
group. If G is a finite subgroup of F*, then G is cyclic. 

Briefly stated: Any finite subgroup of the multiplicative group of a 
field is cyclic. 


Proof: Let $\#(G) = n$. For each divisor $d$ of $n$, let

$$\psi(d) = \text{the number of elements of } G \text{ whose order is } d.$$

Since the order of any element of $G$ is a divisor of $n$, we have clearly

$$n = \sum_{d \mid n} \psi(d).$$

Fix a divisor $d$ of $n$. If $\psi(d) \ne 0$, then there exists $a \in G$ with $\operatorname{ord} a = d$.
Thus, the group

$$[a] = \{1, a, a^2, \ldots, a^{d-1}\}$$

consists of $d$ distinct elements, and (since $a^d = 1$) each of them is a root of
the polynomial $x^d - 1 \in F[x]$. According to 3-5-5, the polynomial $x^d - 1$ can
have at most $d$ distinct roots in $F$. Therefore, $x^d - 1$ has exactly $d$ distinct
roots in $F$; they are $1, a, \ldots, a^{d-1}$.

Now, suppose $b \in G$ is any element of order $d$; then, as above,

$$[b] = \{1, b, \ldots, b^{d-1}\}$$

is the set of all roots (in $F$) of the polynomial $x^d - 1$. Therefore, $[b] = [a]$ and
in particular, $b$ is a generator of the cyclic group $[a]$. Since $[a]$ has order $d$,
we know (from 4-3-13) it has $\phi(d)$ generators. Thus, any element $b$ of order
$d$ must be one of the $\phi(d)$ generators of $[a]$. This implies $\psi(d) \le \phi(d)$ [under
the hypothesis that $\psi(d) \ne 0$]. Of course, if $\psi(d) = 0$, then surely $\psi(d) \le \phi(d)$.
Consequently,

$$\psi(d) \le \phi(d), \quad \text{for every } d \mid n.$$

Making use of 4-3-15, we have:

$$n = \sum_{d \mid n} \psi(d) \le \sum_{d \mid n} \phi(d) = n.$$




This implies $\psi(d) = \phi(d)$ for every $d \mid n$. In particular, $\psi(n) = \phi(n) \ne 0$, so
there exists an element of order $n$, and $G$ is cyclic. ∎

This result tells us, for example, that because 229 is prime, the group
$\{\mathbf{Z}_{229}^*, \cdot\}$ is cyclic of order 228. Although this cyclic group has $\phi(228) = 72$
generators (by 4-3-13), our result provides no information on how to locate
a generator.
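
Although the theorem gives no recipe, a generator can be found by search. The sketch below is not a method from the text; it rests on 4-3-9, which says that $\operatorname{ord} g$ divides $p - 1$. It follows that $g$ fails to be a generator exactly when $g^{(p-1)/q} = 1$ for some prime $q$ dividing $p - 1$. The function names are our own, and $p$ is assumed (not checked) to be prime.

```python
def prime_factors(n):
    """Set of distinct prime divisors of n, by trial division."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def find_generator(p):
    """Smallest generator of Z_p^* for a prime p (primality is assumed)."""
    n = p - 1
    qs = prime_factors(n)
    for g in range(1, p):
        # g generates Z_p^* iff no proper divisor n // q of n kills it.
        if all(pow(g, n // q, p) != 1 for q in qs):
            return g


g = find_generator(229)
print(g, len({pow(g, k, 229) for k in range(228)}))
```

The second printed number counts the distinct powers of the element found; it equals 228 exactly when the element is a generator.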

4-3-18. PROBLEMS

1. In $\{\mathbf{Z}_{10}, +\}$, find the cyclic subgroups:

(i) $[2]$, (ii) $[4]$, (iii) $[6]$, (iv) $[1]$, (v) $[8]$.

2. In $\{\mathbf{Z}_{14}^*, \cdot\}$, find the cyclic subgroups:

(i) $[3]$, (ii) $[5]$, (iii) $[9]$, (iv) $[11]$.

3. For each element of the group $\{\mathbf{Z}_{15}, +\}$, find the cyclic subgroup which
it generates. Which elements are generators of $\mathbf{Z}_{15}$?

4. For each element of the group $\{\mathbf{Z}_{15}^*, \cdot\}$, find the cyclic subgroup which
it generates. Which elements are generators of $\mathbf{Z}_{15}^*$?

5. Do the same thing for $\{\mathbf{Z}_{18}^*, \cdot\}$.

6. Which of the following groups is cyclic?

(i) $\{\mathbf{Q}, +\}$, (ii) $\{\mathbf{Q}^*, \cdot\}$, (iii) $\{\mathbf{R}, +\}$, (iv) $\{\mathbf{R}^*, \cdot\}$,

(v) $\{W, \cdot\}$, where $W = \{z \in \mathbf{C} \mid |z| = 1\}$.

7. Which of the following groups is cyclic? 

(0 { Zt •}, 07) { Z*, •}, (iii) { Z* 3 , •}, (iv) { Z*,,.}. 

For those that are cyclic, exhibit a generator. 

8 . What is the order of: 

(0 2 in {R*, •}, (iii) V2 in {R*, •}, 

(ii) i in {Q, +}, (iv) 1 + i in {C, +}, 

(v) 1 + / in {C*, •}, (vi) -1—^- in {C*, •} ? 

V2 

9. For each a in the octic group find [a]. Is this group cyclic? 

10. Suppose $G$ is cyclic with generator $a$; is it true that $a^{-1}$ is also a generator?

11. Prove: Every group with at most five elements is abelian.

12. Show that if the elements $a$, $b$, $ab$ of the group $G$ are all of order 2, then
$ab = ba$.
13. For any $a, b \in G$ show that the elements $ab$ and $ba$ have the same order.

14. Suppose $G$ is a cyclic group of order $m$, and $r$ is an integer relatively
prime to $m$; then for $b, c \in G$ prove that

$$b^r = c^r \Rightarrow b = c.$$

15. Show that in an abelian group, the set of all elements of finite order is a
subgroup.

16. In the symmetric group find two distinct subgroups of order

(i) 2, (ii) 3, (iii) 4.

17. The symmetric group S 5 has 120 elements. For which factors m of 120 
can you find an element of S 5 of order m ? 

18. For each m > 1, prove that the symmetric group S m contains a cyclic 
subgroup of order m. 

19. Prove: A group of even order contains an odd number of elements of 
order 2. 

20. Find all generators of the cyclic group { Z 24 , +}. 

21. Show that the integer $a = 1, 2, \ldots, m - 1$ is a generator of $\{\mathbf{Z}_m, +\}$ $\iff$
$(a, m) = 1$.

22. (i) Let $a$ and $b$ be elements with $ab = ba$ and of orders $r$ and $s$,
respectively; if $(r, s) = 1$, show that $\operatorname{ord}(ab) = rs$.

(ii) If $r$ and $s$ are not relatively prime, then $\operatorname{ord}(ab) \mid [r, s]$. Give an
example where $\operatorname{ord}(ab) \ne [r, s]$.

23. If $G$ has order $pq$, where $p$ and $q$ are distinct primes, then to find all
proper (that is, nontrivial) subgroups show that it suffices to compute
$[a]$ for every $a \in G$.

24. Let $S$ be a nonempty subset of the group $G$; if we write, as usual,
$S^{-1} = \{a^{-1} \mid a \in S\}$, show that

25. Characterize the following subsets of the group $\{\mathbf{Z}, +\}$, for arbitrary
integers $m$ and $n$:

(i) $m\mathbf{Z} \cap n\mathbf{Z}$, (ii) $m\mathbf{Z} \cup n\mathbf{Z}$, (iii) $[m\mathbf{Z} \cup n\mathbf{Z}]$.

26. Prove: (i) If $G$ is infinite cyclic, then for every $d > 0$ there exists a
unique subgroup of index $d$ in $G$.

(ii) If $G$ is cyclic of order $m$, then for every $d \mid m$ there exists a
unique subgroup of index $d$ in $G$.
27. If $G$ is cyclic of order $m$, show that the number of distinct subgroups of
$G$ is equal to the number of divisors of $m$.

28. (i) Suppose $G$ is an abelian group of order 6. If there exists an element
$a \in G$ with $\operatorname{ord} a = 3$, then prove that $G$ is cyclic.

(ii) $S_3$ is a nonabelian group of order 6. Can you find another such
group?

29. State and prove the additive analogs of:

(i) 4-3-6, (ii) 4-3-7, (iii) 4-3-8,

(iv) 4-3-9, (v) 4-3-11, (vi) 4-3-13.

30. In $\{\mathbf{Z}, +\}$ find $[S]$ when $S$ is the set:

(i) $\{3, 4\}$, (ii) $\{3, 6\}$, (iii) $\{3, 4, 6\}$, (iv) $\{9, 12\}$.

31. Do Problem 30 in the group: 

(i) $\{\mathbf{Z}_{19}, +\}$, (ii) $\{\mathbf{Z}_{20}, +\}$, (iii) $\{\mathbf{Z}_{24}, +\}$, (iv) $\{\mathbf{Z}_{30}, +\}$.

32. In $\{\mathbf{Z}_{13}^*, \cdot\}$ find $[S]$ when $S$ is the set:

(i) $\{2, 5\}$, (ii) $\{3, 4\}$, (iii) $\{3, 7\}$, (iv) $\{2, 6\}$.

33. Do Problem 32 in the group: 

0) { z* , •}, 00 { Z* , •}, 070 { Zf 9 , •}, (iv) { Z 2 * 3 , •}. 

34. Show that every group G (with more than one element) has at least two 
generating sets. 

35. Can you find a generating set for S 4 consisting of two elements? What 
about S 5 ? 

36. If $H$ is a subgroup of $G$, then prove that $[G - H] = G$.

37. (i) Find all subgroups of the cyclic group $\{\mathbf{Z}_{30}, +\}$.

(ii) If $G = [a]$ is a cyclic multiplicative group of order 30, find all its
subgroups.

38. If $G_1$ is cyclic of order $m$ and $G_2$ is cyclic of order $n$, then, if $(m, n) = 1$,
show that the direct product $G_1 \times G_2$ (see 4-1-12, Problem 26) is cyclic
of order $mn$.

39. Prove that if the direct product $G_1 \times G_2$ is cyclic, so are $G_1$ and $G_2$.


4-4. Normal Subgroups; Factor Groups; Homomorphisms 

Consider a multiplicative group G and a subgroup H . In Section 4-2 we 
introduced the left cosets 

$$aH = \{ah \mid h \in H\}, \quad a \in G$$

and saw that they provide a disjoint decomposition of $G$. Furthermore,
according to 4-2-15, if we define

$$a \equiv b \pmod{H} \iff b^{-1}a \in H,$$

then “$\equiv$ (mod $H$)” is an equivalence relation. The equivalence class to which
$a$ belongs consists of $\{b \in G \mid b \equiv a \pmod{H}\}$, and when it is denoted by $\lfloor a \rfloor_H$,
we have $\lfloor a \rfloor_H = aH$.

The question we wish to confront is whether the set of left cosets of H 
in G can be made into a group. Why does one even raise this question? 
Simply because it is related to something we have done before. In detail, 
suppose we take for G the additive group of integers, G = Z, and let H be
any subgroup not equal to (0). Then H is of form H = mZ for some m ≥ 1.
Because our group is additive, a coset looks like

a + H = a + mZ = {a + mt | t ∈ Z} = ⌊a⌋_m, a ∈ Z.

Again, because our group is additive, the equivalence relation “≡ (mod H)”
mentioned above takes the form a ≡ b (mod H) ⟺ b − a ∈ H, and the
equivalence class to which a belongs is

⌊a⌋_H = a + H = a + mZ = ⌊a⌋_m.

Another way to see this is as follows: We have

a ≡ b (mod H) ⟺ b − a ∈ H
              ⟺ b − a ∈ mZ
              ⟺ m | (b − a)
              ⟺ a ≡ b (mod m),

so the equivalence relation “≡ (mod H)” is the same as the familiar
equivalence relation “≡ (mod m).” In particular, their equivalence classes are
identical, so for any a ∈ Z we have

⌊a⌋_H = ⌊a⌋_m.

Now, in Chapters I and II we saw how the residue classes (mod m) can
be made into a ring. Here we are concerned only with the fact that the residue
classes (mod m) form a group under the operation

⌊a⌋_m + ⌊b⌋_m = ⌊a + b⌋_m

or, to phrase it in terms of cosets, the cosets of H = mZ in the additive
group Z constitute a group under the operation

⌊a⌋_H + ⌊b⌋_H = ⌊a + b⌋_H.

This concrete situation suggests that in the general situation where H is
a subgroup of the multiplicative group G we try to make the set of left cosets
into a group by defining

⌊a⌋_H · ⌊b⌋_H = ⌊ab⌋_H;

in other words, the left coset determined by a times the left coset determined
by b is the left coset determined by ab.

Clearly, the set of left cosets is closed under this operation. The associative
law holds since ⌊a⌋_H(⌊b⌋_H⌊c⌋_H) and (⌊a⌋_H⌊b⌋_H)⌊c⌋_H are both equal to ⌊abc⌋_H.
The element (that is, left coset) ⌊e⌋_H is an identity, since ⌊e⌋_H⌊a⌋_H = ⌊a⌋_H
= ⌊a⌋_H⌊e⌋_H for all a ∈ G. Finally, any ⌊a⌋_H has an inverse; in fact, (⌊a⌋_H)⁻¹
= ⌊a⁻¹⌋_H, since ⌊a⌋_H⌊a⁻¹⌋_H = ⌊a⁻¹⌋_H⌊a⌋_H = ⌊e⌋_H.

Thus, the set of left cosets of H in G is a group, except for one important
detail. The definition of the operation depends on the choice of the coset
representatives, and consequently, we need to know that the operation is well
defined. In other words, in order for the left cosets to be a group we must be
certain that

⌊a′⌋_H = ⌊a⌋_H and ⌊b′⌋_H = ⌊b⌋_H imply ⌊a′b′⌋_H = ⌊ab⌋_H.

Unfortunately, this is not always true. For example, suppose G = S_3 and
H is the subgroup {e, τ_1} (where τ_1 is the transposition defined in 4-1-9). As seen
in 4-2-12 (H being the subgroup denoted there by H_1), we have left cosets

⌊σ_1⌋_H = σ_1H = {σ_1, τ_3} = τ_3H = ⌊τ_3⌋_H,

⌊σ_2⌋_H = σ_2H = {σ_2, τ_2} = τ_2H = ⌊τ_2⌋_H.

(We have no need for the third coset here.) Using the representatives σ_1 and
σ_2, we obtain for the product of the two cosets

⌊σ_1⌋_H · ⌊σ_2⌋_H = ⌊σ_1σ_2⌋_H = ⌊e⌋_H.

On the other hand, if we use the representatives τ_3 and τ_2, the product of
the same two cosets becomes

⌊τ_3⌋_H · ⌊τ_2⌋_H = ⌊τ_3τ_2⌋_H = ⌊σ_2⌋_H,

and, of course, ⌊e⌋_H = {e, τ_1} ≠ {σ_2, τ_2} = ⌊σ_2⌋_H.
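
The failure just described can be checked mechanically. The following is only a sketch under stated assumptions: permutations of three symbols are stored as tuples on 0, 1, 2, the helper names (compose, left_coset, product_is_well_defined) are ours and not the text's, and the subgroup H is taken to be generated by the transposition that fixes the first symbol. For contrast, the same test is run on the subgroup of even permutations.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'q first, then p' (tuples acting on 0..n-1)."""
    return tuple(p[q[i]] for i in range(len(q)))

def left_coset(a, H):
    """The left coset aH as a frozenset."""
    return frozenset(compose(a, h) for h in H)

def product_is_well_defined(G, H):
    """Check that the coset of a'b' does not depend on the representatives a', b'."""
    for a in G:
        for b in G:
            target = left_coset(compose(a, b), H)
            for a2 in left_coset(a, H):
                for b2 in left_coset(b, H):
                    if left_coset(compose(a2, b2), H) != target:
                        return False
    return True

S3 = list(permutations(range(3)))
e = (0, 1, 2)
tau = (0, 2, 1)                      # assumed stand-in for a transposition of S_3
H = {e, tau}

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i) if p[j] > p[i])
    return inversions % 2 == 0

A3 = {p for p in S3 if is_even(p)}   # the three even permutations

print(product_is_well_defined(S3, H))    # the two-element subgroup
print(product_is_well_defined(S3, A3))   # the subgroup of even permutations
```

The brute-force loop simply tries every pair of cosets and every choice of representatives, which is affordable only because the group is tiny.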

Thus, the operation of multiplication for left cosets is not always well 
defined, and it behooves us to find conditions under which the operation will 
be well defined. For this, it is convenient to make some technical preparations. 

4-4-1. Exercise. We develop some elementary facts about computations
with subsets of a group; these facts may be considered as part of a calculus
of sets.

Let S, T, U denote arbitrary subsets of the multiplicative group G, and
let us write

ST = {st | s ∈ S, t ∈ T}, S⁻¹ = {s⁻¹ | s ∈ S}.

In words, ST is the set of all products of an element of S and an element of
T (in the appropriate order), and S⁻¹ is the set of all inverses of elements of
S. For any a ∈ G we have the set {a}, consisting of the single element a; we
write aS in place of {a}S, that is,

aS = {a}S = {as | s ∈ S},

and Sa instead of S{a}. Of course, this is entirely consistent with the
notation we used for cosets. We have:

(i) (ST)U = S(TU); in other words, the “associative law” holds, and
we may write STU. The “generalized associative law” then holds,
and we can write any product of sets, say S_1S_2S_3 ··· S_n, without
worrying about parentheses.

(ii) For any a ∈ G,

aS ⊆ aT ⟺ S ⊆ T ⟺ Sa ⊆ Ta,

aS ⊆ Ta ⟺ S ⊆ a⁻¹Ta ⟺ Sa⁻¹ ⊆ a⁻¹T ⟺ aSa⁻¹ ⊆ T.

(iii) (S⁻¹)⁻¹ = S; S ⊆ T ⟺ S⁻¹ ⊆ T⁻¹; (ST)⁻¹ = T⁻¹S⁻¹.

(iv) For a nonempty subset H of G the following are equivalent:

(*) H is a subgroup of G,

(**) HH ⊆ H and H⁻¹ ⊆ H,

(***) HH⁻¹ ⊆ H.

(v) If H is a subgroup, then aH = H = Ha for all a ∈ H, HH = H, and
H⁻¹ = H.

(vi) If H is a subgroup, then

HaH ⊆ aH ⟺ Ha ⊆ aH.


It is left for the reader to state and prove the additive analogs of these facts. 
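
A few of the facts in 4-4-1 can be spot-checked on a small example before one proves them. The sketch below is an illustration only, not a proof: it assumes permutations of three symbols stored as tuples, and the names set_product and set_inverse, as well as the particular subsets S, T, H, are our own choices.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def set_product(S, T):
    """ST = {st | s in S, t in T}."""
    return {compose(s, t) for s in S for t in T}

def set_inverse(S):
    """The set of inverses of the elements of S."""
    return {inverse(s) for s in S}

G = list(permutations(range(3)))
S = {G[1], G[3]}                 # arbitrary subsets chosen for illustration
T = {G[2], G[4], G[5]}
H = {(0, 1, 2), (0, 2, 1)}       # a subgroup of order 2

# Fact (iii): the inverse of ST is (T inverse)(S inverse).
lhs = set_inverse(set_product(S, T))
rhs = set_product(set_inverse(T), set_inverse(S))

# Fact (v): HH = H and the set of inverses of H is H, for a subgroup H.
hh = set_product(H, H)
h_inv = set_inverse(H)

print(lhs == rhs, hh == H, h_inv == H)
```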

Returning to the multiplicative group G, subgroup H, and operation of
multiplication on the set of left cosets of H in G given by ⌊a⌋_H · ⌊b⌋_H =
⌊ab⌋_H, we have:

The set of left cosets is a group under multiplication
⟺ multiplication of left cosets is well defined
⟺ ⌊a′⌋_H = ⌊a⌋_H and ⌊b′⌋_H = ⌊b⌋_H imply ⌊a′b′⌋_H = ⌊ab⌋_H
⟺ a′ ∈ ⌊a⌋_H and b′ ∈ ⌊b⌋_H imply a′b′ ∈ ⌊ab⌋_H
⟺ a′ ∈ aH and b′ ∈ bH imply a′b′ ∈ (ab)H
⟺ (aH)(bH) ⊆ (ab)H for all a, b ∈ G
⟺ aHbH ⊆ abH for all a, b ∈ G
⟺ HbH ⊆ bH for all b ∈ G
⟺ Hb ⊆ bH for all b ∈ G
⟺ b⁻¹Hb ⊆ H for all b ∈ G
⟺ b⁻¹H ⊆ Hb⁻¹ for all b ∈ G
⟺ bH ⊆ Hb for all b ∈ G (since {b⁻¹ | b ∈ G} = G)
⟺ bHb⁻¹ ⊆ H for all b ∈ G.

Furthermore, using the fact (proved above) that

bH ⊆ Hb for all b ∈ G ⟺ Hb ⊆ bH for all b ∈ G,

we have

bH ⊆ Hb for all b ∈ G ⟺ bH = Hb for all b ∈ G
                      ⟺ bHb⁻¹ = H for all b ∈ G.

It is time to summarize what we have accomplished. 

4-4-2. Theorem. Suppose H is a subgroup of the multiplicative group G.
Then the following conditions are equivalent:

(i) bHb⁻¹ ⊆ H for all b ∈ G,

(ii) bHb⁻¹ = H for all b ∈ G,

(iii) bH = Hb for all b ∈ G.

When any one of these conditions holds, H is said to be a normal subgroup
of G.

Moreover, the set of left cosets becomes a group with respect to the
operation ⌊a⌋_H · ⌊b⌋_H = ⌊ab⌋_H if and only if H is normal. When this is
the case, the group of cosets is known as the factor group or quotient
group; it is denoted by G/H. The identity element is ⌊e⌋_H = H and the
inverse of ⌊a⌋_H is (⌊a⌋_H)⁻¹ = ⌊a⁻¹⌋_H.

4-4-3. Remarks. (1) To show that a subgroup H is normal, one usually
verifies the condition bHb⁻¹ ⊆ H for all b ∈ G. After all, it is easier to prove an
inclusion relation for two sets than to prove equality of two sets (since proving
equality of sets often involves showing that each one is contained in the other).

(2) The condition bH = Hb says that for a normal subgroup the left and
right cosets are the same; more precisely, for any b ∈ G, the left coset to
which it belongs is the same set as the right coset to which it belongs. Of
course, as seen in our discussion, the condition could be replaced by either
of the seemingly weaker conditions

bH ⊆ Hb for all b or Hb ⊆ bH for all b.

(3) If H is a subgroup of G, then clearly the condition for the right cosets
of H in G to form a group is that H be normal. This may be seen, mechanically,
by rewriting our entire discussion in terms of right cosets. Another way to
convince oneself of this fact is to note that the condition bH = Hb for
normality indicates that the stories for left cosets and right cosets are
“symmetrical.”

(4) Suppose H is a normal subgroup of G; then for a, b ∈ G the expression
(aH)(bH) has two possible meanings. In the first place, this may be viewed
as the product of the two subsets aH and bH of G, so

(aH)(bH) = aHbH = abHH = abH.

In the second place, (aH)(bH) may be viewed as the product of two cosets as
elements of the factor group G/H, so

(aH)(bH) = ⌊a⌋_H · ⌊b⌋_H = ⌊ab⌋_H = abH.

Thus, the two interpretations are compatible in all ways.

In similar fashion, the two interpretations of (aH)⁻¹ do not conflict;
namely, as a set,

(aH)⁻¹ = H⁻¹a⁻¹ = Ha⁻¹ = a⁻¹H

while, as an element of G/H,

(aH)⁻¹ = (⌊a⌋_H)⁻¹ = ⌊a⁻¹⌋_H = a⁻¹H.

(5) It is left to the reader to state the additive version of 4-4-2.

4-4-4. Examples. (1) For an arbitrary group G the two trivial subgroups
G and {e} are obviously normal.

(2) With notation as in 4-2-12, let G = S_3 = {e, σ_1, σ_2, τ_1, τ_2, τ_3}; then
the subgroup A = {e, σ_1, σ_2} is normal; in fact, the computations done in
4-2-12, part (1), guarantee that for every ρ ∈ G we have ρA = Aρ. The factor
group G/A consists of two elements, namely, the two cosets

⌊e⌋_A = A = {e, σ_1, σ_2} and ⌊τ_1⌋_A = τ_1A = {τ_1, τ_2, τ_3}.

On the other hand, the subgroup H_1 = {e, τ_1} of S_3 is not normal: we
do not have ρH_1 = H_1ρ for every ρ ∈ G since, in particular,

σ_1H_1 = {σ_1, τ_3} is not equal to H_1σ_1 = {σ_1, τ_2}.

Similarly, the subgroup H_2 = {e, τ_2} is not normal (because σ_1H_2 = {σ_1, τ_1} ≠
{σ_1, τ_3} = H_2σ_1), and the subgroup H_3 = {e, τ_3} is also not normal (because,
as the reader can check easily, σ_1H_3 ≠ H_3σ_1). This settles completely the
question of which subgroups of S_3 are normal.

(3) For an arbitrary group G, we have introduced its center

ℨ = {a ∈ G | ax = xa for all x ∈ G}

in 4-2-7. Is the subgroup ℨ normal? By definition of ℨ, any b ∈ G commutes
with any a ∈ ℨ. Consequently,

bℨ = ℨb for all b ∈ G,

so the center is always a normal subgroup.

(4) If the group G is abelian, then surely every subgroup H is normal
and we may form the factor group G/H. For example, suppose G is the
additive group of integers Z. Then any subgroup H = mZ, m ≥ 1, is normal,
and we can form the factor group Z/mZ. Its elements are the cosets a + mZ
of mZ in Z, and these are simply the residue classes of Z modulo m. Thus,
Z/mZ consists of the m objects ⌊0⌋_m, ⌊1⌋_m, ..., ⌊m − 1⌋_m, and by the way
addition is defined in Z/mZ (namely, ⌊a⌋_m + ⌊b⌋_m = ⌊a + b⌋_m) it is clear that
the quotient group Z/mZ is the same as the additive group {Z_m, +} of the
ring {Z_m, +, ·}. We shall write

Z/mZ = Z_m (as additive groups).
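
The verdict of Example (2) on the subgroups of S_3 can be confirmed by exhaustive search. The sketch below rests on assumptions of our own: permutations are tuples on 0, 1, 2, subgroups are found by testing every subset containing the identity, and normality is tested through condition (i) of 4-4-2. The function names are hypothetical.

```python
from itertools import permutations, combinations

def compose(p, q):
    """Return the permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Return the inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))
e = tuple(range(3))

def is_subgroup(H):
    # For a finite subset containing e, closure under the product suffices.
    return e in H and all(compose(a, b) in H for a in H for b in H)

def is_normal(H, G):
    # Condition (i) of 4-4-2: every conjugate b h b^(-1) stays inside H.
    return all(compose(compose(b, h), inverse(b)) in H for b in G for h in H)

others = [g for g in G if g != e]
subgroups = []
for r in range(len(others) + 1):
    for extra in combinations(others, r):
        H = frozenset((e,) + extra)
        if is_subgroup(H):
            subgroups.append(H)

all_orders = sorted(len(H) for H in subgroups)
normal_orders = sorted(len(H) for H in subgroups if is_normal(H, G))
print(all_orders, normal_orders)
```

Listing only the orders keeps the output short: the three subgroups of order 2 are exactly the ones that fail the test.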

We turn to a discussion of mappings of groups, and shall soon see how 
they are related to normal subgroups and factor groups. 

4-4-5. Definition. Suppose groups G and G′ are given and, for convenience,
let them both be multiplicative. A mapping φ: G → G′ is said to be a
homomorphism when

φ(ab) = φ(a)φ(b)

for all a, b ∈ G. In addition: If φ is surjective (that is, onto) we call it an
epimorphism; if φ is injective (that is, one-to-one) we call it a monomorphism;
if φ is surjective and injective (that is, an epimorphism and a monomorphism)
we call it an isomorphism and write G ≈ G′.

These definitions are entirely in the same spirit as those given for rings
in 2-5-6. In general, when we are concerned with two algebraic objects of
the same type, a mapping of one into the other is said to be a
“homomorphism” when it preserves all the corresponding operations. A
homomorphism which is surjective and injective is said to be an “isomorphism”;
in other words, the term isomorphism signifies that the two algebraic objects
should be considered to be the “same.”

The basic properties of a homomorphism of groups are analogous to 
those of a homomorphism of rings. For example, in analogy with 2-5-8 and 
2-5-9, we have: 


4-4-6. Proposition. Suppose G is a multiplicative group with identity e,
and G′ is a multiplicative group with identity e′. If φ: G → G′ is a
homomorphism then:

(1) φ(e) = e′,

(2) φ(a⁻¹) = φ(a)⁻¹ for all a ∈ G,

(3) φ(a^n) = φ(a)^n for all a ∈ G, n ∈ Z,

(4) φ(G), the image of φ, is a subgroup of G′,

(5) if we define the kernel of φ (and denote it as: ker φ) by

ker φ = {a ∈ G | φ(a) = e′},

then ker φ is a normal subgroup of G,

(6) φ(a) = φ(b) ⟺ ab⁻¹ ∈ ker φ ⟺ a⁻¹b ∈ ker φ,

(7) φ is injective ⟺ ker φ = (e),

(8) if φ is injective, then φ is an isomorphism of G onto φ(G).


Proof: (1) φ(e) = φ(e²) = φ(e)φ(e), which implies φ(e) = e′.

(2) e′ = φ(e) = φ(aa⁻¹) = φ(a)φ(a⁻¹), which says that φ(a⁻¹) is the
inverse of φ(a) in G′; so φ(a⁻¹) = φ(a)⁻¹.

(3) φ(a^n) = φ(a)^n holds for n = 0 [as φ(e) = e′], for n = 1, for n = 2
[as φ(a²) = φ(a)φ(a) = φ(a)²], and by induction it holds for all n ≥ 0. Then
for any n < 0 we have −n > 0 and

φ(a^n) = φ[(a⁻¹)^(−n)] = (φ(a⁻¹))^(−n) = (φ(a)⁻¹)^(−n) = φ(a)^n.

(4) Consider a′, b′ ∈ φ(G), so a′ = φ(a), b′ = φ(b) for some a, b ∈ G.
Then a′b′ = φ(a)φ(b) = φ(ab) ∈ φ(G) and (a′)⁻¹ = φ(a)⁻¹ = φ(a⁻¹) ∈ φ(G).
Hence, φ(G) is closed under multiplication and taking of inverses, so it is a
subgroup of G′. One expresses this fact by saying: A homomorphic image of
a group is a group.

(5) Write N = ker φ. Then, a, b ∈ N means that φ(a) = e′, φ(b) = e′, and
we have

φ(ab⁻¹) = φ(a)φ(b⁻¹) = φ(a)φ(b)⁻¹ = e′(e′)⁻¹ = e′,

so ab⁻¹ ∈ N and N is a subgroup. To prove it is normal, we show aNa⁻¹ ⊆ N
for every a ∈ G. In fact, for any c ∈ N and any a ∈ G we have

φ(aca⁻¹) = φ(a)φ(c)φ(a)⁻¹ = φ(a)e′φ(a)⁻¹ = e′

so aca⁻¹ ∈ N and indeed aNa⁻¹ ⊆ N.

(6) Still writing N = ker φ, we have

φ(a) = φ(b) ⟺ φ(a)φ(b)⁻¹ = e′
            ⟺ φ(ab⁻¹) = e′
            ⟺ ab⁻¹ ∈ N.

The other condition arises from the fact that N is normal; in detail,

ab⁻¹ ∈ N ⟺ a ∈ Nb
         ⟺ a ∈ bN
         ⟺ b⁻¹a ∈ N
         ⟺ (b⁻¹a)⁻¹ ∈ N
         ⟺ a⁻¹b ∈ N.

Our formulation of (6) probably hides its full significance; another way
to express (6), one which comes closer to the mark, is: The elements a and
b have the same image under φ ⟺ they belong to the same coset of N =
ker φ in G.

(7) If φ is injective and a ∈ ker φ, then φ(a) = e′ = φ(e); so a = e because
φ is injective, and hence ker φ = (e).

Conversely, if ker φ = (e), then using (6),

φ(a) = φ(b) ⟺ ab⁻¹ ∈ ker φ = (e) ⟺ a = b,

so φ is injective.

(8) φ is an injective homomorphism of G → G′, so using (4), the
homomorphism φ: G → φ(G) is injective and surjective. ∎

Incidentally, if the group G is multiplicative while G′ is additive, then the
condition for a homomorphism takes the form

φ(ab) = φ(a) + φ(b).

Of course, results such as 4-4-6 remain valid, with trivial modifications of
notation.

4-4-7. Proposition. Suppose φ: G → G′ and ψ: G′ → G″ are
homomorphisms of groups; then

(1) ψ ∘ φ: G → G″ is a homomorphism. In words: the composition of
homomorphisms is a homomorphism.

(2) If both φ and ψ are surjective, so is ψ ∘ φ.

(3) If both φ and ψ are injective, so is ψ ∘ φ.

(4) If both φ and ψ are isomorphisms, so is ψ ∘ φ. In words: The
composition of isomorphisms is an isomorphism.

(5) If φ is an isomorphism, then φ⁻¹ exists and is an isomorphism.


Proof: The details go like those of 2-5-10, so they are left to the reader.

The reader will also note the following immediate consequence of these 
facts: 

4-4-8. Proposition. Isomorphism is an equivalence relation on the set of 
all groups. 

4-4-9. Examples. (1) Consider the symmetric groups S_4 and S_5. The
elements of S_4 may be viewed as all the permutations of the set {1, 2, 3, 4},
and the elements of S_5 may be viewed as all the permutations of the set
{1, 2, 3, 4, 5}. Let us define a mapping

φ: S_4 → S_5

as follows: For σ ∈ S_4 let φ(σ) = σ′ ∈ S_5 be the permutation whose action is

σ′(1) = σ(1), σ′(2) = σ(2), σ′(3) = σ(3), σ′(4) = σ(4), σ′(5) = 5.

In other words, σ′ is the same as σ on {1, 2, 3, 4} and it keeps 5 fixed.

Obviously, φ is a homomorphism. It is injective (since φ(σ) = σ′ = e ∈ S_5
implies σ(1) = 1, σ(2) = 2, σ(3) = 3, σ(4) = 4, so σ is the identity of S_4) but
not surjective (for example, the element of S_5 given in two-row notation by
(1 2 3 4 5 / 2 3 4 5 1), which moves 5, is not in the
image of φ). Thus, according to 4-4-6, part (8), S_5 contains an isomorphic
copy of S_4, namely, the image of S_4 under φ.

In the same way, for any n ≥ 1, S_{n+1} contains an isomorphic copy of
S_n (that is, we can produce an injective homomorphism of S_n into S_{n+1}) so
S_n may be viewed as a subgroup of S_{n+1}.

(2) Consider the additive groups Z_m and Z_n where n | m. Define a mapping

φ: Z_m → Z_n

by putting

φ(⌊a⌋_m) = ⌊a⌋_n, a ∈ Z.

In the first place, φ is well defined (that is, ⌊a′⌋_m = ⌊a⌋_m implies φ(⌊a′⌋_m) =
φ(⌊a⌋_m)) since

⌊a′⌋_m = ⌊a⌋_m ⇒ m | (a′ − a) ⇒ n | (a′ − a) ⇒ ⌊a′⌋_n = ⌊a⌋_n.

Now, φ is a homomorphism, since

φ(⌊a⌋_m + ⌊b⌋_m) = φ(⌊a + b⌋_m) = ⌊a + b⌋_n
                 = ⌊a⌋_n + ⌊b⌋_n = φ(⌊a⌋_m) + φ(⌊b⌋_m).

Obviously, φ is surjective. As for the kernel of φ,

⌊a⌋_m ∈ ker φ ⟺ φ(⌊a⌋_m) = ⌊0⌋_n
             ⟺ ⌊a⌋_n = ⌊0⌋_n
             ⟺ n | a.

Hence, if we write m = nd, then ker φ consists of the d elements 0, n, 2n,
3n, ..., (d − 1)n of Z_m; of course, these form a (normal) subgroup of Z_m.

(3) Consider the additive group of reals R = {R, +} and the multiplicative
group of positive reals R_{>0} = {R_{>0}, ·}. Consider the map

log: R_{>0} → R

where “log” refers (for simplicity) to the logarithm to the base e. We recall
from elsewhere that the logarithm is defined only for positive real numbers
and its graph is shown in the accompanying figure.

[Figure: graph of the logarithm, axes x and Y]

The crucial and well-known algebraic property of log (which we take for
granted here) is

log(ab) = log a + log b

for all positive reals a, b. In other words, log: R_{>0} → R is a homomorphism.
Is it injective? surjective? an isomorphism? An elegant way to settle these
questions is to define a function

exp: R → R_{>0}

by putting, for all x ∈ R,

exp(x) = e^x.

The graph of this function is shown in the accompanying figure, and exp x
is always in R_{>0}.

[Figure: graph of the exponential function, axes x and Y]

Of course, exp is a homomorphism, since for x, y ∈ R

exp(x + y) = e^(x+y) = e^x e^y = (exp x)(exp y)

(the standard rule for exponents, e^x e^y = e^(x+y), being taken for granted here).
Now, let us examine the composition of these maps. For a ∈ R_{>0},

(exp ∘ log)(a) = exp(log a) = e^(log a) = a

and for b ∈ R

(log ∘ exp)(b) = log(exp b) = log(e^b) = b.

Thus, exp ∘ log is the identity map of R_{>0}, and log ∘ exp is the identity map
of R; so log and exp are inverses of each other: exp = log⁻¹ and log = exp⁻¹.
In particular, both log and exp are 1-1 and onto, so they are isomorphisms.

Of course, when logarithms were introduced several hundred years ago,
it was for the purpose of replacing multiplication of numbers by the “easier”
(in terms of computational effort) operation of addition. The fact that log
is an isomorphism makes it all legal.
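
Examples (2) and (3) lend themselves to a quick numerical check. What follows is a sketch under assumptions of our own: the moduli m = 12 and n = 4 are illustrative values with n | m, residue classes are represented by the integers 0, ..., m − 1, and the reals a, b are arbitrary positive sample values. Floating-point comparisons use a tolerance, so the second half illustrates the identities rather than proving them.

```python
import math

# Example (2): the reduction map from Z_m to Z_n, with n | m.
m, n = 12, 4          # illustrative assumption, not values from the text

def phi(a):
    """Send the class of a (mod m) to the class of a (mod n)."""
    return a % n

# Well defined: changing the representative by a multiple of m changes nothing.
well_defined = all(phi(a) == phi(a + k * m)
                   for a in range(m) for k in range(-2, 3))

# Homomorphism: phi(a + b) = phi(a) + phi(b), computed in Z_m and in Z_n.
is_hom = all(phi((a + b) % m) == (phi(a) + phi(b)) % n
             for a in range(m) for b in range(m))

# Kernel: the classes mapped to 0; these should be the multiples of n.
kernel = [a for a in range(m) if phi(a) == 0]

# Example (3): log turns products into sums, and exp undoes log.
a, b = 3.5, 12.25
log_is_hom = math.isclose(math.log(a * b), math.log(a) + math.log(b))
exp_inverts_log = math.isclose(math.exp(math.log(a)), a)

print(well_defined, is_hom, kernel, log_is_hom, exp_inverts_log)
```

With m = nd the kernel has d = 3 elements here, in agreement with the count obtained in Example (2).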


4-4-10. Theorem (Cayley: 1821-1895). Any group G is isomorphic to a 
group of permutations. 


Proof: Suppose the group G = {a, b, c, ...} is multiplicative. Viewing G
as a set, let S_G denote the group of all permutations of this set. The operation
in S_G is composition, ∘. We shall show that G is isomorphic to some subgroup
of S_G by exhibiting an injective homomorphism of G into S_G; in this
way, G will be isomorphic to a group whose elements are permutations,
which is what Cayley’s theorem asserts.

For each a ∈ G, define a mapping

L_a: G → G

by putting

L_a(x) = ax, x ∈ G.

In other words, L_a is left multiplication by a in G. Now, L_a is a permutation
of G (that is, L_a ∈ S_G) since:

(i) given b ∈ G there exists x ∈ G for which ax = b, so L_a is surjective,
(ii) L_a(x_1) = L_a(x_2) ⇒ ax_1 = ax_2 ⇒ x_1 = x_2, so L_a is injective.

Now, define a mapping

φ: G → S_G

by putting

φ(a) = L_a, a ∈ G.

To prove that φ is a homomorphism, we observe first that for a, b ∈ G,
L_a ∘ L_b = L_ab [since for any x ∈ G, (L_a ∘ L_b)(x) = L_a(L_b(x)) = L_a(bx) =
abx = L_ab(x)]. Consequently, for any a, b ∈ G, we have

φ(ab) = L_ab = L_a ∘ L_b = φ(a) ∘ φ(b)

and φ is a homomorphism.





It remains to show φ is injective, that is, ker φ = (e). To do this, suppose
a ∈ ker φ. Then φ(a) = L_a is the identity element of S_G; that is, L_a is the
permutation of G which leaves every element fixed. In other words,

L_a(x) = x for every x ∈ G,

which says that ax = x for every x ∈ G. This clearly implies a = e. The proof
is now complete. ∎
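
The construction in the proof of Cayley's theorem can be carried out concretely for a small group. The sketch below makes several assumptions of its own: the group is taken to be the units 1, 3, 5, 7 under multiplication mod 8, each permutation L_a is stored as a tuple of positions in the list G, and the function name left_regular is hypothetical.

```python
def left_regular(G, op):
    """Map each a in G to L_a, stored as the tuple of positions of a*x for x in G."""
    index = {x: i for i, x in enumerate(G)}
    return {a: tuple(index[op(a, x)] for x in G) for a in G}

def compose(p, q):
    """Return the permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

G = [1, 3, 5, 7]                      # assumed example: units mod 8

def op(a, b):
    return (a * b) % 8

L = left_regular(G, op)

# Each L_a is a permutation of the positions 0..3.
are_perms = all(sorted(p) == list(range(len(G))) for p in L.values())

# The map a -> L_a is a homomorphism: L_ab is 'L_b first, then L_a'.
is_hom = all(L[op(a, b)] == compose(L[a], L[b]) for a in G for b in G)

# It is injective: distinct elements give distinct permutations.
is_injective = len(set(L.values())) == len(G)

print(L[3], are_perms, is_hom, is_injective)
```

Storing L_a as positions rather than group elements keeps the composition step independent of what the elements of G happen to be.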

The next order of business is to expose the connection between normal
subgroups and homomorphisms. Roughly speaking, given a normal subgroup
N of G there exists a “canonical” (meaning “standardized”) homomorphism
whose kernel is N. On the other hand any homomorphism of G
into some group determines a normal subgroup N of G, namely its kernel.
In detail, we have:


4-4-11. Proposition. Suppose N is a normal subgroup of G; then there
is a canonical homomorphism

π: G → G/N

defined by

π(a) = ⌊a⌋_N = aN.

In fact, π is a surjective homomorphism whose kernel is N.


Proof: Of course, G/N is the factor group, as described in 4-4-2. The map
π is the most natural mapping of G → G/N: it maps each element of G to
the coset which it determines.

Now, π is a homomorphism, since for any a, b ∈ G we have

π(ab) = ⌊ab⌋_N = ⌊a⌋_N · ⌊b⌋_N = π(a) · π(b).

Furthermore, π is surjective, for given an arbitrary element of G/N it is of
form ⌊x⌋_N for some x ∈ G, and then π(x) = ⌊x⌋_N.

As for the kernel of π: since the identity of G/N is ⌊e⌋_N = N, we have

a ∈ ker π ⟺ π(a) = ⌊e⌋_N
          ⟺ a ∈ N. ∎


4-4-12. Isomorphism Theorem. Suppose the mapping φ: G → G′ is a
surjective homomorphism. If its kernel is N, then the factor group G/N
is isomorphic to G′, that is,

G/N ≈ G′.

Proof: The kernel N is a normal subgroup of G, so the factor group
G/N exists. We define a mapping

φ̄: G/N → G′

by putting

φ̄(⌊a⌋_N) = φ(a).

Because the definition of φ̄ depends on the choice of representative for the
coset we must show that φ̄ is well defined. This follows from

⌊a⌋_N = ⌊b⌋_N ⇒ a⁻¹b ∈ N
             ⇒ φ(a⁻¹b) = e′, the identity of G′
             ⇒ φ(a) = φ(b)
             ⇒ φ̄(⌊a⌋_N) = φ̄(⌊b⌋_N).

Now, φ̄ is a homomorphism, since

φ̄(⌊a⌋_N · ⌊b⌋_N) = φ̄(⌊ab⌋_N)
                  = φ(ab)
                  = φ(a) · φ(b)
                  = φ̄(⌊a⌋_N) · φ̄(⌊b⌋_N).

The image of G/N under φ̄ is

{φ̄(⌊a⌋_N) | ⌊a⌋_N ∈ G/N} = {φ(a) | a ∈ G}
                          = G′ (since φ is surjective),

so φ̄ is surjective.

As for the kernel of φ̄, we have

⌊a⌋_N ∈ ker φ̄ ⟺ φ̄(⌊a⌋_N) = e′
              ⟺ φ(a) = e′
              ⟺ a ∈ ker φ
              ⟺ a ∈ N
              ⟺ ⌊a⌋_N is the identity element of G/N.

Hence, φ̄ is injective, so φ̄ is an isomorphism of G/N onto G′. ∎

A picture commonly associated with this result is

[Diagram: triangle formed by φ: G → G′, π: G → G/N, and φ̄: G/N → G′]

and one says that the diagram “commutes” (more accurately, “is
commutative”) because of the relation

φ̄ ∘ π = φ.
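
The isomorphism theorem and the commutativity of the diagram can be tested on a small case. The sketch below uses an example that is our assumption and not taken from the text: the sign map from the permutations of three symbols onto the multiplicative group {1, −1}. The induced map is stored as a dictionary keyed by cosets; all helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def sign(p):
    """+1 for an even number of inversions, -1 for an odd number."""
    inversions = sum(1 for i in range(len(p)) for j in range(i) if p[j] > p[i])
    return 1 if inversions % 2 == 0 else -1

G = list(permutations(range(3)))

# The assumed homomorphism really is one (checked by brute force).
sign_is_hom = all(sign(compose(a, b)) == sign(a) * sign(b) for a in G for b in G)

N = frozenset(g for g in G if sign(g) == 1)      # the kernel

def coset(a):
    """The left coset aN."""
    return frozenset(compose(a, n) for n in N)

# Induced map on cosets: send aN to sign(a), checking independence of a.
induced = {}
well_defined = True
for a in G:
    C = coset(a)
    if C in induced and induced[C] != sign(a):
        well_defined = False
    induced[C] = sign(a)

induced_is_hom = all(
    induced[coset(compose(a, b))] == induced[coset(a)] * induced[coset(b)]
    for a in G for b in G
)
is_bijective = len(induced) == 2 and sorted(induced.values()) == [-1, 1]

# Commutativity of the triangle: induced map after the canonical map equals sign.
commutes = all(induced[coset(a)] == sign(a) for a in G)

print(well_defined, induced_is_hom, is_bijective, commutes)
```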


We content ourselves here with a single application of the isomorphism
theorem. Suppose G = [a] is an arbitrary cyclic group (written
multiplicatively). Define a mapping

φ: Z → G

by putting

φ(n) = a^n.

For any n_1, n_2 ∈ Z we have

φ(n_1 + n_2) = a^(n_1 + n_2) = a^(n_1) a^(n_2) = φ(n_1)φ(n_2)

so φ is a homomorphism of the additive group of integers, {Z, +}, into G.
Furthermore, φ is surjective because

image φ = {φ(n) | n ∈ Z} = {a^n | n ∈ Z} = [a] = G.

Now, consider N = ker φ. If N = (0), then φ is injective, so it is an
isomorphism of {Z, +} onto G. On the other hand, if N ≠ (0), then N is of
form mZ for some m ≥ 1, so φ is a surjective homomorphism of Z → G with
kernel N = mZ and the isomorphism theorem tells us that the factor group
Z/mZ is isomorphic to G. But Z/mZ is really the additive group of the ring
Z_m [as noted in 4-4-4, part (4)], so we have

{Z_m, +} ≈ G.


Let us summarize: 


4-4-13. Proposition. Suppose G is a cyclic group; then

G ≈ Z if #(G) = ∞,

G ≈ Z_m if #(G) = m.


In particular, any two cyclic groups of the same order are isomorphic.

Proof: According to the above, G is isomorphic to Z or to Z_m for some
m ≥ 1. Clearly, the first case (that is, G ≈ Z) occurs if and only if #(G) = ∞,
and the second case occurs if and only if #(G) < ∞. Now, if G ≈ Z_m, then
#(G) = #(Z_m) = m (because isomorphic finite groups have the same
number of elements), so if G is cyclic of order m it must be isomorphic to
Z_m.

Finally, consider two cyclic groups G and G′ of the same order. If #(G) =
#(G′) = ∞, they are both isomorphic to Z; if #(G) = #(G′) = m, they are
both isomorphic to Z_m. In either case, G ≈ G′. ∎
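
The map n → a^n used above can be watched at work in a finite cyclic group. As a sketch, and purely as an assumption for illustration, we take G to be the subgroup generated by a = 3 in the multiplicative group of nonzero residues mod 7; only nonnegative exponents are used so that the built-in modular power applies directly.

```python
p, a = 7, 3           # illustrative assumption: G = [3] inside the units mod 7

def phi(n):
    """The map n -> a^n, computed mod p for n >= 0."""
    return pow(a, n, p)

# The kernel of phi is mZ, where m is the least positive n with a^n = 1.
m = next(k for k in range(1, p) if phi(k) == 1)

# The image of phi is the whole cyclic group generated by a.
image = sorted({phi(n) for n in range(m)})

# Z_m -> G, r -> a^r, respects the operations: addition mod m becomes
# multiplication mod p.
is_hom = all(phi((r + s) % m) == (phi(r) * phi(s)) % p
             for r in range(m) for s in range(m))

# Distinct residues mod m give distinct powers, so the map is injective.
is_injective = len({phi(r) for r in range(m)}) == m

print(m, image, is_hom, is_injective)
```

Here the generator has order 6, so the group it generates is a copy of {Z_6, +}, as 4-4-13 predicts.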

4-4-14. Exercise. Let G = {a, b, c, ...} be a group with identity e, and
let S_G = {σ, τ, ρ, ...} be the group of all permutations of the set G. The
operation in S_G is composition of mappings, and let us denote the identity
of S_G by 1.

(1) Any isomorphism of G onto itself is known as an automorphism of
G. If 𝔄_G denotes the set of all automorphisms of G (note that 1 ∈ 𝔄_G), then
𝔄_G is a subgroup of S_G; it is known as the automorphism group of G.

(2) For each a ∈ G define the mapping

I_a: G → G

by putting

I_a(x) = axa⁻¹, x ∈ G.

This mapping I_a is known as conjugation by a. Show that I_a is an
automorphism of G; it is called the inner automorphism determined by a.

(3) Any element of form I_a(x) = axa⁻¹ is said to be a conjugate of x.
If H is a subgroup of G, then I_a(H) = aHa⁻¹ is a subgroup of G, known as a
conjugate of H. If we put

N = ⋂_{a ∈ G} aHa⁻¹,

then N is a normal subgroup of G. In addition, H is a normal subgroup of G
if and only if every conjugate of H is equal to H.

(4) Let 𝔍_G denote the set of all inner automorphisms of G; then 𝔍_G is a
normal subgroup of 𝔄_G; it is known as the group of inner automorphisms.
(Incidentally, an automorphism of G which is not inner is called an outer
automorphism.) Denoting the center of G by ℨ_G, we have (by considering
the mapping a → I_a)

G/ℨ_G ≈ 𝔍_G.

4-4-15 / PROBLEMS

1. Consider the octic group G = {e, σ_1, σ_2, σ_3, τ_1, τ_2, ρ_1, ρ_2} as discussed
in 4-1-10 and 4-2-6. Decide which of the following subgroups are normal:

(i) {e, σ_2}, (ii) {e, τ_1},

(iii) {e, τ_2}, (iv) {e, ρ_1},

(v) {e, ρ_2}, (vi) {e, σ_1, σ_2, σ_3},

(vii) {e, ρ_1, ρ_2, σ_2}, (viii) {e, τ_1, τ_2, σ_2}.

Are there any other proper subgroups of the octic group?

2. Exhibit groups G ⊃ H ⊃ K such that K is normal in H, but K is not
normal in G. Can this be done if H is also required to be normal in G?

3. If N is a normal subgroup of the finite group G, what is the relation
between the orders of the three groups G/N, G, N?

4. Show that the intersection of any collection of normal subgroups of a
given group G is a normal subgroup of G.

5. Let G = S_X be the group of permutations of the set X, and for a fixed
y ∈ X consider the subgroup H = H_y = {σ ∈ G | σy = y}. If X has more
than 2 elements, show that H is not normal in G.

6. If H is a subgroup of G of index 2 (that is, (G : H) = 2), show that H
must be normal.

7. If N is a normal subgroup of G and H is an arbitrary subgroup of G,
prove that H ∩ N is a normal subgroup of H.

8. For any elements a, b in the group G, let us write

[a, b] = aba⁻¹b⁻¹

and call it the commutator of a and b. Let C denote the set of all finite
products of commutators; then prove:

(i) C is a subgroup (known as the commutator subgroup or the derived
group) of G; in fact, C is the subgroup generated by the set of all
commutators.

(ii) C is normal in G.

(iii) The factor group G/C is abelian.

9. Exhibit two subgroups H and K of G = S_3 such that HK is not a
subgroup.

10. Let H and K be subgroups of G; show that:

(i) HK is a subgroup of G ⟺ HK = KH.

(ii) If H is normal, then HK is a subgroup.

(iii) If both H and K are normal, then HK is a normal subgroup.

11. Suppose H and N are normal subgroups of G with H ∩ N = (e); show
that any element of H commutes with any element of N (that is:
a ∈ H, b ∈ N ⇒ ab = ba; equivalently, a ∈ H, b ∈ N ⇒ aba⁻¹b⁻¹ = e).

12. For elements x, y ∈ G let us write

x ~ y ⟺ y = axa⁻¹ for some a ∈ G.

In words, x ~ y means that y is a conjugate of x. Show that ~ is an
equivalence relation.

The equivalence class to which x belongs consists of a single element
(namely, x itself) ⟺ x is in the center of G.

13. Suppose φ: G → G′ is an isomorphism of groups. Prove that

(i) G is abelian ⟺ G′ is abelian,

(ii) G is cyclic ⟺ G′ is cyclic.

14. Suppose φ: G → G′ is a surjective homomorphism. Show that

(i) G is abelian ⇒ G′ is abelian,

(ii) G is cyclic ⇒ G′ is cyclic.

In other words, a homomorphic image of an abelian group is abelian
and a homomorphic image of a cyclic group is cyclic. Are the converses
of these statements true? Justify your answers.

15. For i = √−1 the mapping n → i^n is a homomorphism of {Z, +} into
{C*, ·}. What is the image? What is the kernel?

16. Suppose N is a normal subgroup of G. Prove:

(i) if G is abelian so is G/N,

(ii) if G is cyclic so is G/N.

Why is this problem essentially the same as Problem 14?

17. Suppose the group G has exactly one element of order 2; call it a.
Show that a is in the center of G.

18. In 4-4-9, part (1), we indicated the existence of an injective
homomorphism of S_n → S_{n+1}. Find as many such injective homomorphisms
as you can.

19. The additive group of integers, {Z, +}, is a normal subgroup of the
additive group of rationals, {Q, +}. Every element of the factor group
Q/Z has finite order (show how to find the order) but the group has
infinite order. Prove these assertions.

20. (i) The “absolute value” z → |z| is a homomorphism of C* (the
multiplicative group of nonzero complex numbers) into R* (the
multiplicative group of nonzero reals). What is the image? What is the kernel?

(ii) The circle group W = {z ∈ C | |z| = 1} (see 4-1-8) is a normal
subgroup of C*. Describe the factor group C*/W.
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21. If G is cyclic, then for any subgroup N both N and G/N are cyclic. Show 
that the converse is false; in other words, the normal subgroup N and 
the factor group G/N can both be cyclic without G being cyclic—as a 
matter of fact, G need not even be abelian. 

22. Prove that if p is prime, then, upto isomorphism, there exists a unique 
group of order p —namely, the cyclic group of order p. 

23. (/) The following three groups of order 4 are isomorphic: 

(a) { Zf, •}, whose elements are 1, 3, 5, 7, 

(b) the four group (as defined in 4-1-12, Problem 16), 

(c) the direct product G t x G 2 (see 4-1-12, Problem 26) of two 
cyclic groups G 1 and G 2 of order 2. 

Prove this by producing explicit isomorphisms. 

(ii) Another group of order 4 is the cyclic one—say { Z 4 , +}. It is not 
isomorphic to the groups in (/). 

(iii) There are no other groups of order 4; in other words, a group of 
order 4 is isomorphic to the four group or to { Z 4 , +}. In particular, 
any group of order 4 is abelian. 

24. The symmetric group S 3 and the cyclic group { Z 6 , + } are nonisomorphic 
(that is, “distinct”) groups of order 6. Show that, upto isomorphism, 
there are no other groups of order 6. 

25. Suppose $\phi: G \to G'$ is a homomorphism and $a \in G$. Prove:

(i) If $\phi$ is an isomorphism, then $\operatorname{ord}(\phi(a)) = \operatorname{ord} a$. In fact, the same
conclusion holds if $\phi$ is injective.

(ii) If $\phi$ is surjective, then $\operatorname{ord}(\phi(a))$ divides $\operatorname{ord} a$.

26. The groups $\{\mathbf{Z}_{10}, +\}$ and $\{\mathbf{Z}_{11}^*, \cdot\}$ are both cyclic of order 10. Can you
find an isomorphism

$\phi: \mathbf{Z}_{10} \to \mathbf{Z}_{11}^*$

for which

(i) $\phi(1) = 2$, (ii) $\phi(1) = 3$?

Find as many isomorphisms of $\mathbf{Z}_{10} \to \mathbf{Z}_{11}^*$ as possible.

27. If p is prime, the groups $\{\mathbf{Z}_{p-1}, +\}$ and $\{\mathbf{Z}_p^*, \cdot\}$ are isomorphic. How
would you go about finding all isomorphisms between them?

28. Suppose $G = [a]$ is cyclic. Prove:

(i) If $\phi: G \to G'$ is a homomorphism, it is completely determined once
we know $\phi(a)$.

(ii) Consider any $a' \in G'$. If we put $\psi(a) = a'$, this determines a mapping
$\psi: G \to G'$ defined by $\psi(a^n) = (\psi(a))^n = (a')^n$ for all $n \in \mathbf{Z}$. What
condition must $a'$ satisfy in order that $\psi$ be a homomorphism? Is $\psi$
really a mapping?

(iii) Apply this information in Problems 26 and 27.
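
As a computational aid for Problems 26 to 28, the following is a minimal Python sketch, not a substitute for the proofs. It assumes that $\mathbf{Z}_{10}$ is represented by the integers 0, ..., 9 under addition mod 10 and $\mathbf{Z}_{11}^*$ by 1, ..., 10 under multiplication mod 11; the function names `candidate_map` and `is_isomorphism` are hypothetical. Following 28(i), a homomorphism is pinned down by the image g of the generator 1, so the code tabulates $k \to g^k$ and tests which choices of g give a well-defined bijection.

```python
def candidate_map(g, n=10, p=11):
    """Images of 0, 1, ..., n-1 under k -> g^k (mod p)."""
    return [pow(g, k, p) for k in range(n)]


def is_isomorphism(g, n=10, p=11):
    """True when k -> g^k (mod p) is a well-defined bijection Z_n -> Z_p^*."""
    # Well defined: k and k + n must have the same image, i.e. g^n = 1.
    well_defined = pow(g, n, p) == 1
    images = candidate_map(g, n, p)
    # Bijective: the n images are pairwise distinct.
    return well_defined and len(set(images)) == n


# Every element of Z_11^* is tried as a possible image of the generator 1.
good_images = [g for g in range(1, 11) if is_isomorphism(g)]
print(good_images)
```

The same two functions can be reused for Problem 27 by passing `n = p - 1` and a different prime `p`.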
29. Show that the only inner automorphism of an abelian group is the identity
map.

30. Consider each of the following groups, G,

(i) $\{\mathbf{Z}, +\}$, (ii) $\{\mathbf{Z}_7, +\}$,

(iii) the cyclic group of prime order p, with multiplication as the operation
and generator a,

(iv) the four group, (v) the symmetric group, $S_3$,

(vi) the cyclic multiplicative group $[a]$ of order n.

In each case, describe all the automorphisms. How many are there?
Which ones are inner automorphisms? What is the structure of the
group of automorphisms, $\mathfrak{A}_G$?

31. By an endomorphism of a group G we mean a homomorphism of G into 
itself. If G is cyclic of order n, describe all endomorphisms of G. How 
many are there ? 

32. If the groups G and G' are isomorphic then their automorphism groups
$\mathfrak{A}_G$ and $\mathfrak{A}_{G'}$ are isomorphic. In addition, there is a 1-1 correspondence
between $\mathfrak{A}_G$ and the set of all isomorphisms of $G \to G'$. In fact, if
$\phi: G \to G'$ is a fixed isomorphism, then

$\sigma \to \phi \circ \sigma, \quad \sigma \in \mathfrak{A}_G$

is the desired 1-1 correspondence.

33. The group $\{\mathbf{Z}_{15}^*, \cdot\}$ has $\phi(15) = 8$ elements. The element 2 has order 4,
and the element 11 has order 2. If we write $G_1 = [2]$ and $G_2 = [11]$, these
are cyclic groups of orders 4 and 2, respectively; show that $\mathbf{Z}_{15}^*$ is isomorphic
to the direct product $G_1 \times G_2$.

34. Suppose H and N are subgroups of G with N normal. Then $H \cap N$ is a
normal subgroup of H, and HN is a group in which N is a normal subgroup.
The mapping of

$H \to HN/N$

defined by

$a \to aN, \quad a \in H$

is a surjective homomorphism with kernel $H \cap N$. Consequently,

$H/(H \cap N) \approx HN/N.$

This is known as the first (or second if one counts 4-4-12 as the first)
isomorphism theorem.
4-5. Permutation Groups

According to Cayley’s theorem, 4-4-10, any group may be viewed as a
group of permutations, that is, as a subgroup of the symmetric group $S_X$
for some set X. Consequently, to study all possible groups, it suffices to
investigate all symmetric groups and their subgroups. Historically, people
were studying “groups of permutations” long before the axioms for a group
were formulated. Thus, the content of this section (which is not needed for
the sequel) is designed primarily to enlarge our knowledge of the symmetric
group $S_n$ and its subgroups. In particular, we will be obtaining information
about an arbitrary finite group (since, as seen in the proof of Cayley’s theorem,
a group of order n is isomorphic to a subgroup of $S_n$). However, the reader
should keep in mind that experience has shown that it is often better to deal
with an abstract group rather than tie oneself down to a fixed concrete
realization (that is, isomorphic copy) of the group.

4-5-1. Notation. We recall that $S_n = \{\sigma, \tau, \rho, \ldots\}$ is the group of all
permutations of the set $X = \{1, 2, \ldots, n\}$, with the operation of multiplication
(that is, composition) written as either $\sigma \circ \tau$ or $\sigma\tau$. Of course, $\sigma \in S_n$
means that $\sigma$ is a mapping of $X \to X$ which is 1-1 and onto, where, as is
customary, $\sigma$ is 1-1 means

for any $x, y \in X$, $\sigma x = \sigma y \Rightarrow x = y$

and $\sigma$ is onto means

given any $y \in X$ there exists $x \in X$ such that $\sigma x = y$.

We introduced a natural notation for elements of $S_n$ in 4-1-11; here we
shall modify and simplify the notation. The full story will be clear as soon as
we treat the following elements of $S_9$.


(i) Consider the permutation

$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 4 & 5 & 9 & 8 & 7 & 1 & 6 & 3 & 2 \end{pmatrix}.$

We start with the fact that $\sigma$ maps 1 → 4. It also maps 4 → 8, and continuing
this procedure of applying $\sigma$ to the image, we have in addition: 8 → 3, 3 → 9,
9 → 2, 2 → 5, 5 → 7, 7 → 6. At the next stage, we have 6 → 1, so the action of
$\sigma$ may be described by the “diagram”

1 → 4 → 8 → 3 → 9 → 2 → 5 → 7 → 6 → 1.




We express this more compactly by writing

$\sigma = (148392576)$,

the interpretation being obvious.

There is no particular reason for starting with 1 and pursuing its successive
images under $\sigma$. Any of the elements 1, 2, ..., 9 would do just as well. For example,
starting from 8, we arrive at

$\sigma = (839257614)$

which is another (obviously “equivalent”) symbol for the permutation $\sigma$.
(ii) Consider the permutation

$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 7 & 9 & 1 & 2 & 5 & 8 & 6 & 4 \end{pmatrix}.$

Starting from 1, the action of $\sigma$ gives: 1 → 3 → 9 → 4, and since 4 → 1, we
should obviously write (1394) for

1 → 3 → 9 → 4 → 1.

However, this gives no information about what $\sigma$ does to 2, 5, 6, 7, 8. Consequently,
we start from 2, say, and obtain

2 → 7 → 8 → 6 → 5 → 2.

The way to denote $\sigma$ is then

$\sigma = (1394)(27865)$.

Among the other ways to denote this same $\sigma$ one has:

(4139)(27865), (1394)(86527), (78652)(9413), (27865)(1394),
and so on.

(iii) In the same spirit, the symbol

(182)(347)(596)

represents the permutation under which

1 → 8 → 2 → 1, 3 → 4 → 7 → 3, 5 → 9 → 6 → 5;

in other words, the element of $S_9$ under consideration is

$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 8 & 1 & 4 & 7 & 9 & 5 & 3 & 2 & 6 \end{pmatrix}.$

Obviously, this may also be denoted by (821)(347)(659), (734)(659)(218),
and so on.

(iv) Consider the permutation

$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 9 & 1 & 8 & 5 & 3 & 2 & 4 & 6 \end{pmatrix}.$

According to our procedure this can be written as

$\sigma = (172963)(48)(5)$.

Note: The appearance of (5) signifies that $\sigma 5 = 5$, that is, 5 is kept fixed
by $\sigma$. We shall write

$\sigma = (172963)(48)$

under the general convention that whenever an integer (from 1, 2, ..., 9)
does not appear it is understood to be fixed under the permutation. Thus, for

$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 4 & 2 & 7 & 8 & 6 & 3 & 9 & 5 \end{pmatrix}$

we write (2473)(589) instead of (1)(2473)(6)(589).

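(v) According to our conventions

$\sigma = (375)(29)$

is shorthand notation for

$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 9 & 7 & 4 & 3 & 6 & 5 & 8 & 2 \end{pmatrix},$

while

$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 3 & 2 & 9 & 1 & 5 & 6 & 7 & 8 & 4 \end{pmatrix}$

is denoted by

$\tau = (1394)$.

(vi) How one writes the identity permutation of $S_9$,

$\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \end{pmatrix},$

is a matter of taste. When there is no way out, we will write $e = (1)$.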


4-5-2. Remark. Without being overly precise, let us indicate some
properties in $S_n$ which are immediate consequences of our new notation.
Suppose $a_1, a_2, \ldots, a_r$ are distinct elements, chosen in any order, from
the set $X = \{1, 2, \ldots, n\}$; then $(a_1, a_2, \ldots, a_r)$ is said to be a cycle of length r,
or simply an r-cycle. This terminology is natural because $(a_1, a_2, \ldots, a_r)$
represents the element $\sigma$ of $S_n$ whose action is described by

$a_1 \to a_2 \to \cdots \to a_r \to a_1$,

where it is understood that the remaining elements of X are fixed under $\sigma$.
The use of the word “cycle” is based on the fact that the elements are
permuted in “circular” or “cyclical” fashion, and we have

$(a_1, a_2, \ldots, a_r) = (a_2, a_3, \ldots, a_r, a_1) = \cdots = (a_r, a_1, \ldots, a_{r-1})$. (*)

Cycles of length r exist for any r satisfying $1 \le r \le n$. Any 1-cycle is of form
$(a_1)$, so it leaves every element of X (including $a_1$) fixed; that is, it represents
the identity permutation. (Clearly, 1-cycles are not especially interesting and
are usually ignored.) Any 2-cycle is of form $(a_1 a_2)$, $a_1 \ne a_2$; it is known as a
transposition because it represents the permutation which interchanges (that
is, transposes) $a_1$ and $a_2$, while leaving the remaining elements of X fixed.
(Transpositions are especially important, as will be seen later.)

Two cycles $\sigma = (a_1, a_2, \ldots, a_r)$ and $\tau = (b_1, b_2, \ldots, b_s)$ in $S_n$ are said to
be disjoint when the sets $\{a_1, \ldots, a_r\}$ and $\{b_1, \ldots, b_s\}$ are disjoint; in other
words, when the two cycles have no symbols in common. For example, in
$S_9$, the 4-cycle $\sigma = (2473)$ and the 5-cycle $\tau = (58961)$ are disjoint, while the
cycles $\sigma = (2473)$ and $\tau = (179)$ are not disjoint.

If we have several cycles $\sigma_1, \sigma_2, \ldots, \sigma_t$ in $S_n$, we say they are disjoint
when every pair of them is disjoint. For example, in $S_9$, $\sigma_1 = (57)$, $\sigma_2 = (492)$,
$\sigma_3 = (18)$ are disjoint; but if we take $\sigma_4 = (67)$, then $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ are
not disjoint.

According to the procedure illustrated in 4-5-1, every $\sigma \in S_n$ can be written
as a product

$\sigma = (a_1, a_2, \ldots, a_{r_1})(b_1, b_2, \ldots, b_{r_2}) \cdots (y_1, y_2, \ldots, y_{r_t})$ (#)

of disjoint cycles. In more detail, starting with any $a_1 \in X = \{1, 2, \ldots, n\}$,
we pursue its successive images under $\sigma$ until we return to $a_1$. This gives a
cycle $(a_1, a_2, \ldots, a_{r_1})$. If this cycle exhausts the elements of X, then we are
finished and $\sigma = (a_1, a_2, \ldots, a_{r_1})$. If this cycle does not exhaust X, choose
any $b_1 \in X$ which does not appear in the cycle $(a_1, a_2, \ldots, a_{r_1})$. Pursuing the
images of $b_1$ under $\sigma$, we obtain a cycle $(b_1, b_2, \ldots, b_{r_2})$, disjoint from the
first one. If these two cycles together do not exhaust X, choose $c_1 \in X$ which
does not appear in either cycle, and keep going. This process stops after a
finite number of steps (because X is finite) and we finally have an expression
of form (#) for $\sigma$.
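
The procedure just described is mechanical enough to be sketched in a few lines of Python. This is an illustration only: it assumes the permutation is stored as a dictionary sending each element of X to its image, and the function name `disjoint_cycles` is hypothetical. Cycles of length 1 are kept, as in the expression (#).

```python
def disjoint_cycles(sigma):
    """Decompose a permutation into disjoint cycles, 1-cycles included.

    `sigma` is a dict mapping each element of X to its image.
    """
    seen = set()
    cycles = []
    for start in sorted(sigma):          # smallest element not yet used
        if start in seen:
            continue
        cycle = [start]
        seen.add(start)
        nxt = sigma[start]
        while nxt != start:              # follow images until we return
            cycle.append(nxt)
            seen.add(nxt)
            nxt = sigma[nxt]
        cycles.append(tuple(cycle))
    return cycles


# Example (ii) of 4-5-1, entered from its two-row form.
sigma = dict(zip(range(1, 10), [3, 7, 9, 1, 2, 5, 8, 6, 4]))
cycles = disjoint_cycles(sigma)
print(cycles)  # [(1, 3, 9, 4), (2, 7, 8, 6, 5)]
```

Always starting from the smallest unused element is just one convenient choice; any other starting element gives the same cycles, possibly shifted and listed in a different order.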
In the expression (#) for $\sigma$ the lengths of the cycles are $r_1, r_2, \ldots, r_t$ with
$1 \le r_1, r_2, \ldots, r_t \le n$. Since each element of $X = \{1, 2, \ldots, n\}$ belongs to
exactly one cycle, $r_1 + r_2 + \cdots + r_t$ (which is the sum of the lengths of the
various cycles) is equal to n. The expression (#) for $\sigma$ includes cycles of
length 1, but because such cycles make no contribution they will usually be
dropped (as was done in the specific examples 4-5-1). In the resulting expression
for $\sigma$ [which we still write in the general form (#)] we have therefore
$r_1, r_2, \ldots, r_t \ge 2$ and $r_1 + r_2 + \cdots + r_t \le n$.

Whichever convention is used with regard to the 1-cycles, the order in
which the cycles are listed is immaterial. This is clear from the way we obtain
the expression for $\sigma$; in particular, if we start with $b_1$ (instead of $a_1$), then the
left-most cycle of (#) is $(b_1, b_2, \ldots, b_{r_2})$. Furthermore, each of the individual
cycles may be shifted (that is, “cycled”) as indicated in (*).

The conclusion to be drawn from all this is as follows. Any permutation
can be expressed uniquely as a product of disjoint cycles, it being understood
that uniqueness is up to shifts in the individual cycles and the order
in which the various cycles appear.

The foregoing discussion is largely “intuitive” but it does convey what
is going on. For the reader who craves greater precision, and in order to
clarify the facts, we now sketch a formal approach to the same results.

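Let $\sigma \in S_n$ be given and write, as usual, $X = \{1, 2, \ldots, n\}$. Consider any
$a \in X$. Let us pursue the successive images of a under $\sigma$; they are

$a, \ \sigma a, \ \sigma(\sigma a) = \sigma^2 a, \ \sigma(\sigma^2 a) = \sigma^3 a, \ \ldots, \ \sigma^m a, \ \ldots$

The set of all such images is obviously given by

$\{\sigma^m a \mid m \ge 0\}$.

For any $b \in X$, let us define the symbol $a \equiv_\sigma b$ by

$a \equiv_\sigma b \iff b = \sigma^m a$ for some $m \ge 0$.

In other words, $a \equiv_\sigma b \iff b \in \{\sigma^m a \mid m \ge 0\}$. We assert that $\equiv_\sigma$ (which the
reader may read any way he chooses) is an equivalence relation. First of all,
$a \equiv_\sigma a$ for every $a \in X$, since $\sigma^0 a = ea = a$. In addition, $a \equiv_\sigma b$ and $b \equiv_\sigma c$ imply
$b = \sigma^{m_1} a$, $c = \sigma^{m_2} b$ for certain $m_1 \ge 0$, $m_2 \ge 0$, hence $c = \sigma^{m_2}(\sigma^{m_1} a) = \sigma^{m_1 + m_2} a$
with $m_1 + m_2 \ge 0$, hence $a \equiv_\sigma c$; so $\equiv_\sigma$ is transitive. Finally, the symmetric law
requires a preliminary comment. Since $\sigma$ belongs to the finite group $S_n$, its
order is finite, say $\operatorname{ord} \sigma = s \ge 1$. Thus, $\sigma^s = e$ and $\sigma \cdot \sigma^{s-1} = e$, which says
that $\sigma^{-1} = \sigma^{s-1}$. Consequently,

$$\begin{aligned}
a \equiv_\sigma b &\Rightarrow b = \sigma^m a \text{ for some } m \ge 0 \\
&\Rightarrow a = \sigma^{-m} b, \quad m \ge 0 \\
&\Rightarrow a = (\sigma^{-1})^m b, \quad m \ge 0 \\
&\Rightarrow a = \sigma^{(s-1)m} b \text{ with } (s-1)m \ge 0 \\
&\Rightarrow b \equiv_\sigma a.
\end{aligned}$$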
Then, because $\equiv_\sigma$ is an equivalence relation on the set $X = \{1, 2, \ldots, n\}$ we
know that X decomposes into a union of disjoint equivalence classes.

4-5-3. Definition. The equivalence class (with respect to $\equiv_\sigma$) to which the
element a belongs will be denoted by $\lfloor a \rfloor_\sigma$. The set $\lfloor a \rfloor_\sigma$ is also known as the
orbit of a under $\sigma$, and other notations for it are $\operatorname{orb}_\sigma a$ or simply orb a. (Thus,
any $\sigma \in S_n$ leads to a decomposition of X as a union of the disjoint orbits of
$\sigma$.) An orbit is said to be nontrivial when it contains more than one element.
A permutation $\sigma \in S_n$ is called a cycle when it has exactly one nontrivial
orbit; if the number of elements (of X) in this orbit is r, then $\sigma$ is said to be
an r-cycle. Two or more cycles in $S_n$ are disjoint when their orbits (by which
we mean their nontrivial orbits) are disjoint.

We shall give illustrations of the definition in a moment, but first let us
describe carefully what an orbit looks like. Given $\sigma \in S_n$, the orbit of any
$a \in X$ is a subset of the finite set X, so $\operatorname{orb}_\sigma a$ is a finite set. Which elements of
X belong to $\operatorname{orb}_\sigma a = \lfloor a \rfloor_\sigma$? For $b \in X$, we have (as indicated earlier)

$$\begin{aligned}
b \in \operatorname{orb}_\sigma a &\iff b \equiv_\sigma a \\
&\iff a \equiv_\sigma b \\
&\iff b = \sigma^m a \text{ for some } m \ge 0 \\
&\iff b \in \{\sigma^m a \mid m \ge 0\}.
\end{aligned}$$

In other words,

$\operatorname{orb}_\sigma a = \{\sigma^m a \mid m \ge 0\}$

and (in keeping with our desires and with the informal ideas in 4-5-1 and
4-5-2) the orbit of a is obtained by taking the successive images of a under
$\sigma$. Of course, the set $\{\sigma^m a \mid m \ge 0\}$, being finite, involves repeated elements.
We may also note in passing that because $\operatorname{ord} \sigma = s \ge 1$ and $\sigma^{-1} = \sigma^{s-1}$
(notation as before), we have

$\{\sigma^m a \mid m \ge 0\} = \{\sigma^m a \mid m \in \mathbf{Z}\}$,

so the relation $\equiv_\sigma$ could have been defined, at the start, by

$a \equiv_\sigma b \iff b = \sigma^m a$ for some $m \in \mathbf{Z}$.

This definition of $\equiv_\sigma$ is the more standard one for several reasons, one of
which is the ease with which one then shows $\equiv_\sigma$ to be an equivalence relation.

Because $\sigma^s = e$ we surely have $\sigma^s a = a$, $s \ge 1$. Now let d be the smallest
positive integer for which $\sigma^d a = a$ (obviously such an integer d exists).
Consider the elements

$a, \ \sigma a, \ \sigma^2 a, \ \ldots, \ \sigma^{d-1} a$.
They all belong to $\operatorname{orb}_\sigma a$. Furthermore, they are distinct; for if

$\sigma^i a = \sigma^j a$, where $0 \le i < j \le d - 1$,

then

$\sigma^{j-i} a = a$ with $0 < j - i \le d - 1$,

which contradicts the definition of d. Finally, consider an arbitrary element
of $\operatorname{orb}_\sigma a$; it is of form $\sigma^m a$. Using the division algorithm, we write

$m = qd + r, \quad 0 \le r < d$

and consequently,

$$\begin{aligned}
\sigma^m a = \sigma^{r + qd} a &= \sigma^r(\sigma^{qd} a) \\
&= \sigma^r[(\sigma^d)^q (a)] \\
&= \sigma^r[\underbrace{(\sigma^d)(\sigma^d) \cdots (\sigma^d)}_{q \text{ times}}(a)] \\
&= \sigma^r a.
\end{aligned}$$

In other words, any $\sigma^m a$ belongs to the set $\{a, \sigma a, \ldots, \sigma^{d-1} a\}$. This proves:


4-5-4. Proposition. Given $\sigma \in S_n$, then for any $a \in X = \{1, 2, \ldots, n\}$ its
orbit under $\sigma$ is

$\operatorname{orb}_\sigma a = \{a, \sigma a, \ldots, \sigma^{d-1} a\}$

where d is the smallest positive integer for which $\sigma^d a = a$. The elements
$a, \sigma a, \ldots, \sigma^{d-1} a$ are distinct.


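4-5-5. Remarks. (i) In $S_9$, consider the permutation

$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 9 & 4 & 1 & 8 & 5 & 3 & 6 & 2 \end{pmatrix}.$

Its orbits are the sets {1, 3, 4, 7}, {2, 9}, {5, 6, 8}; they are disjoint and their
union is X = {1, 2, ..., 9}. Of course, the order in which the orbits are listed
is immaterial, as is the order in which the elements in each orbit are listed.
Similarly, the orbits of the permutation

$\tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 8 & 7 & 4 & 3 & 9 & 2 & 5 & 6 \end{pmatrix}$

are the sets {1}, {2, 3, 5, 7, 8}, {4}, {6, 9}; they provide a disjoint decomposition
of the set X = {1, 2, ..., 9}.

Phrasing things in accordance with the notation of 4-5-4, we have (for
the $\sigma$ just mentioned)

$\operatorname{orb}_\sigma 1 = \{1, 7, 3, 4\}$, $\operatorname{orb}_\sigma 2 = \{2, 9\}$,

$\operatorname{orb}_\sigma 3 = \{3, 4, 1, 7\}$, $\operatorname{orb}_\sigma 4 = \{4, 1, 7, 3\}$,

$\operatorname{orb}_\sigma 5 = \{5, 8, 6\}$, $\operatorname{orb}_\sigma 6 = \{6, 5, 8\}$,

and so on. Of course, as sets (with the order of the elements ignored)

$\operatorname{orb}_\sigma 1 = \operatorname{orb}_\sigma 3 = \operatorname{orb}_\sigma 4 = \operatorname{orb}_\sigma 7$,
$\operatorname{orb}_\sigma 2 = \operatorname{orb}_\sigma 9$, $\operatorname{orb}_\sigma 5 = \operatorname{orb}_\sigma 6 = \operatorname{orb}_\sigma 8$.

Similarly, we have (for the $\tau$ just mentioned)

$\operatorname{orb}_\tau 1 = \{1\}$, $\operatorname{orb}_\tau 4 = \{4\}$, $\operatorname{orb}_\tau 5 = \{5, 3, 7, 2, 8\}$,

$\operatorname{orb}_\tau 6 = \{6, 9\}$, $\operatorname{orb}_\tau 7 = \{7, 2, 8, 5, 3\}$, $\operatorname{orb}_\tau 9 = \{9, 6\}$.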
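
The orbit lists in (i) can be reproduced with a short Python sketch of Proposition 4-5-4. The dictionary representation of a permutation and the function name `orbit` are assumptions made for this illustration. The function returns the elements in the order a, σa, σ²a, ..., so its length is the integer d of the proposition.

```python
def orbit(sigma, a):
    """Return [a, sigma(a), sigma^2(a), ...], stopping just before a recurs."""
    result = [a]
    nxt = sigma[a]
    while nxt != a:
        result.append(nxt)
        nxt = sigma[nxt]
    return result


# The permutations sigma and tau of 4-5-5 (i), from their two-row forms.
sigma = dict(zip(range(1, 10), [7, 9, 4, 1, 8, 5, 3, 6, 2]))
tau = dict(zip(range(1, 10), [1, 8, 7, 4, 3, 9, 2, 5, 6]))

print(orbit(sigma, 1))  # [1, 7, 3, 4]
print(orbit(tau, 5))    # [5, 3, 7, 2, 8]

# As sets, the orbits partition X; duplicates collapse in the outer set.
orbits_sigma = {frozenset(orbit(sigma, a)) for a in sigma}
print(len(orbits_sigma))  # 3
```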

(ii) The permutation

$\sigma_1 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 2 & 4 & 1 & 5 & 6 & 3 & 8 & 9 \end{pmatrix}$

has the orbits {1, 7, 3, 4}, {2}, {5}, {6}, {8}, {9}. Since exactly one of the
orbits is nontrivial (meaning: has more than one element), $\sigma_1$ is a cycle.
Because the nontrivial orbit has four elements, $\sigma_1$ is a 4-cycle. The action
of $\sigma_1$ is given by

1 → 7 → 3 → 4 → 1

and all the remaining elements of X are fixed, so we write (forgetting for the
time being what transpired in 4-5-1)

$\sigma_1 = (1, 7, 3, 4)$

or, because the commas can be dropped without danger of confusion,

$\sigma_1 = (1734)$.

This describes the action of $\sigma_1$ completely, so the order in which the terms
appear is crucial. Of course, it is equally valid to write $\sigma_1$ as (7341), or (3417)
or (4173).

The permutation

$\sigma_2 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 9 & 3 & 4 & 5 & 6 & 7 & 8 & 2 \end{pmatrix}$

has the single nontrivial orbit {2, 9}; so it is a 2-cycle, denoted by

$\sigma_2 = (29) = (92)$.




Furthermore,

$\sigma_3 = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 2 & 3 & 4 & 8 & 5 & 7 & 6 & 9 \end{pmatrix}$

has the single nontrivial orbit {5, 8, 6}. It is a 3-cycle and may be written as

$\sigma_3 = (586) = (865) = (658)$.

(iii) We note in passing that, according to the definition 4-5-3, the identity
permutation $e \in S_9$ (or for that matter $e \in S_n$, for any n) is not a cycle;
moreover, there is no such thing as a 1-cycle. The identity $e \in S_n$ is the unique
permutation all of whose orbits are trivial.

(iv) In case the reader has not noticed, let us explain how $\sigma_1, \sigma_2, \sigma_3$
above are related to $\sigma$ [of part (i)]. Consider the orbit {1, 7, 3, 4} of $\sigma$. We
define a permutation in $S_9$ whose action on the elements of this orbit is the
same as the action of $\sigma$, and whose action on any element outside this orbit
is to keep it fixed. In other words, we have

2 → 2, 5 → 5, 6 → 6, 8 → 8, 9 → 9, 1 → 7, 7 → 3, 3 → 4, 4 → 1,

which, indeed, represents a permutation in $S_9$. This permutation (which is a
cycle) is the one we called $\sigma_1$.

Similarly, $\sigma_2 \in S_9$ is the permutation (it is a cycle) which agrees with $\sigma$
on the nontrivial orbit {2, 9}, and leaves the remaining seven elements of
X = {1, 2, ..., 9} fixed. Finally, $\sigma_3$ is taken as the permutation (that is,
cycle) which agrees with $\sigma$ on the orbit {5, 8, 6}, and keeps the remaining
elements fixed.

In the same way, because the permutation $\tau$, introduced in (i), has two
nontrivial orbits, namely {2, 3, 5, 7, 8} and {6, 9}, we associate with it
the two cycles

$\tau_1 = (28537)$ and $\tau_2 = (69)$.

There is a general principle involved here. Namely, if any permutation
$\sigma \ne e$ in $S_n$ is given then with each of its nontrivial orbits we can associate
a cycle (whose action on the orbit in question is the same as the action of $\sigma$).
These cycles are obviously disjoint; they are known as the “disjoint cycles
belonging to $\sigma$,” or simply as the “cycles of $\sigma$.”

(v) With $\tau, \tau_1, \tau_2$ as above, let us look at the product

$\tau_1 \circ \tau_2 = (28537) \circ (69)$

in $S_9$. As is standard, we write this as

$\tau_1 \tau_2 = (28537)(69)$.

The product $\tau_1 \tau_2 \in S_9$ may be computed directly on the basis of this notation
by tracing the action of $\tau_2$ followed by $\tau_1$ (for example: $\tau_2$: 1 → 1 and $\tau_1$:
1 → 1, so $\tau_1 \tau_2$: 1 → 1; $\tau_2$: 2 → 2 and $\tau_1$: 2 → 8, so $\tau_1 \tau_2$: 2 → 8; etc.), or else
we may use the older more cumbersome notation to obtain

$$\begin{aligned}
\tau_1 \tau_2 &= \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 8 & 7 & 4 & 3 & 6 & 2 & 5 & 9 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 2 & 3 & 4 & 5 & 9 & 7 & 8 & 6 \end{pmatrix} \\
&= \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 1 & 8 & 7 & 4 & 3 & 9 & 2 & 5 & 6 \end{pmatrix} \\
&= \tau.
\end{aligned}$$

Thus, $\tau$ has been expressed as the product of the disjoint cycles $\tau_1$ and $\tau_2$ in $S_9$.

We may also compute the product

$\tau_2 \tau_1 = (69)(28537)$

and find the result to be $\tau_2 \tau_1 = \tau$. It is obvious (once one thinks about it)
that the disjoint cycles $\tau_1$ and $\tau_2$ commute with each other. In fact, because
their (nontrivial) orbits are disjoint, an element belonging to the orbit of one
of the cycles is left fixed by the other cycle, and consequently the order in
which $\tau_1$ and $\tau_2$ are applied does not matter.

In similar fashion, with $\sigma, \sigma_1, \sigma_2, \sigma_3$ as above, we have

$\sigma_1 \sigma_2 \sigma_3 = (1734)(29)(586)$

and this product is equal to $\sigma$. The order of $\sigma_1, \sigma_2, \sigma_3$ in the product does not
matter (as the reader can verify easily) because they are disjoint cycles.
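
The two products in (v) can be checked numerically. The sketch below is illustrative only; it assumes permutations of {1, ..., 9} stored as dictionaries, and the helper names `cycle_to_map` and `compose` are hypothetical. Note that `compose(f, g)` applies g first, matching the convention for the product fg used here.

```python
def cycle_to_map(cycle, n=9):
    """Permutation of {1..n} (as a dict) given by one cycle, e.g. (2, 8, 5, 3, 7)."""
    perm = {x: x for x in range(1, n + 1)}
    for i, x in enumerate(cycle):
        perm[x] = cycle[(i + 1) % len(cycle)]
    return perm


def compose(f, g):
    """The product fg: apply g first, then f."""
    return {x: f[g[x]] for x in g}


tau1 = cycle_to_map((2, 8, 5, 3, 7))
tau2 = cycle_to_map((6, 9))
tau = dict(zip(range(1, 10), [1, 8, 7, 4, 3, 9, 2, 5, 6]))

print(compose(tau1, tau2) == tau)  # True
print(compose(tau2, tau1) == tau)  # True

# Cycles that share a symbol need not commute: (15) and (258) both move 5.
a, b = cycle_to_map((1, 5)), cycle_to_map((2, 5, 8))
print(compose(a, b) == compose(b, a))  # False
```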

(vi) The preceding discussion permits us to write

$\tau = (28537)(69)$,

$\sigma = (1734)(29)(586)$,

where, in each case, the right side represents a product of disjoint cycles.
This is in keeping with the notation introduced informally in 4-5-1 and 4-5-2.
However, there is one important addition. There, the use of parentheses
served merely as an aid in specifying the action of the permutation in question,
and writing the parentheses adjacent to each other had no particular significance.
Here we see that the notation of 4-5-1 and 4-5-2, which looks like a
product, is, indeed, to be viewed as a product of cycles, and the use of
parentheses is entirely compatible with this interpretation.

From these examples we are led to the following general result.
4-5-6. Proposition. Fix an integer n > 1; then in $S_n$

(1) disjoint cycles commute,

(2) any $\sigma \ne e$ in $S_n$ can be expressed as a product of disjoint cycles,

(3) this “factorization” is essentially unique.

Proof: In virtue of what has gone before, these assertions are fairly clear
by now. Therefore, we shall give a dry, formal proof, and leave it as an
exercise for the reader to fill in any details which he considers to be missing.

(1) Suppose $\sigma_1, \sigma_2, \ldots, \sigma_r \in S_n$ are disjoint cycles; we must show that
their product $\sigma_1 \sigma_2 \cdots \sigma_r$ (in the group $S_n$) is not affected by any rearrangement
of the terms. For this it suffices to consider the case r = 2 and prove

$\sigma_1 \sigma_2 = \sigma_2 \sigma_1$.

Let $\mathcal{O}_1$ be the nontrivial orbit of $\sigma_1$ and $\mathcal{O}_2$ be the nontrivial orbit of
$\sigma_2$; so, by hypothesis, $\mathcal{O}_1 \cap \mathcal{O}_2 = \emptyset$. For any $a \in X = \{1, 2, \ldots, n\}$ exactly
one of three possibilities holds:

(i) $a \notin \mathcal{O}_1 \cup \mathcal{O}_2$,

(ii) $a \in \mathcal{O}_1$,

(iii) $a \in \mathcal{O}_2$.

In case (i) we have: $\sigma_1 a = a$, $\sigma_2 a = a$, so $\sigma_1 \sigma_2(a) = \sigma_2 \sigma_1(a)$. In case (ii) we
have: $\sigma_1 a \in \mathcal{O}_1$, $\sigma_2 a = a$, so $\sigma_1 \sigma_2(a) = \sigma_1 a$, $\sigma_2 \sigma_1(a) = \sigma_1 a$, and $\sigma_1 \sigma_2(a) =
\sigma_2 \sigma_1(a)$. In case (iii) we have $\sigma_1 \sigma_2(a) = \sigma_2 a = \sigma_2 \sigma_1(a)$. Consequently,

$\sigma_1 \sigma_2(a) = \sigma_2 \sigma_1(a)$ for every $a \in X$,

which says that $\sigma_1 \sigma_2 = \sigma_2 \sigma_1$.

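(2) Given $\sigma \ne e$ in $S_n$, let $\mathcal{O}_1, \mathcal{O}_2, \ldots, \mathcal{O}_r$ be its nontrivial orbits. They are
disjoint. For each i = 1, ..., r define $\sigma_i$ by

$\sigma_i a = \begin{cases} \sigma a & \text{for } a \in \mathcal{O}_i, \\ a & \text{for } a \notin \mathcal{O}_i. \end{cases}$

Then, $\sigma_i \in S_n$; in fact, $\sigma_i$ is a cycle whose sole nontrivial orbit is $\mathcal{O}_i$, and whose
action on $\mathcal{O}_i$ is the same as the action of $\sigma$. The cycles $\sigma_1, \sigma_2, \ldots, \sigma_r$ are
obviously disjoint, so [by part (1)] they commute. We assert that

$\sigma = \sigma_1 \sigma_2 \cdots \sigma_r$.

To see this, consider any $a \in X$. If a lies in no nontrivial orbit, then both sides
fix a. Otherwise a belongs to exactly one of the orbits, say
$a \in \mathcal{O}_i$. Then

$\sigma_j a = a$ for any $j \ne i$ (since $a \notin \mathcal{O}_j$)

and consequently,

$$\begin{aligned}
(\sigma_1 \sigma_2 \cdots \sigma_i \cdots \sigma_r) a &= (\sigma_i \sigma_1 \sigma_2 \cdots \sigma_{i-1} \sigma_{i+1} \cdots \sigma_r) a \\
&= \sigma_i a \\
&= \sigma a.
\end{aligned}$$

This proves $\sigma = \sigma_1 \sigma_2 \cdots \sigma_r$; that is, $\sigma$ is indeed the product of its disjoint cycles.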

(3) Suppose that in addition to the above factorization $\sigma = \sigma_1 \cdots \sigma_r$ we
have also the factorization

$\sigma = \tau_1 \tau_2 \cdots \tau_s$

into disjoint cycles. Denoting the orbit of $\tau_i$ by $\mathcal{O}_i'$, i = 1, ..., s, these
orbits are disjoint, and

$\mathcal{O}_1 \cup \mathcal{O}_2 \cup \cdots \cup \mathcal{O}_r = \mathcal{O}_1' \cup \mathcal{O}_2' \cup \cdots \cup \mathcal{O}_s'$

because both of these are equal to

$\mathcal{O} = \{x \in X \mid \sigma x \ne x\}$

(that is, $\mathcal{O}$ is the set of all elements of X which are not fixed under $\sigma$). Consider
$\mathcal{O}_1'$, and let a be an arbitrary element of $\mathcal{O}_1'$. Then a belongs to exactly one
$\mathcal{O}_j$; we rearrange the notation, if necessary, so that $a \in \mathcal{O}_1$. It is not hard to
see that

$\sigma_1 a = \sigma a = \tau_1 a$.

Furthermore, because disjoint cycles commute, we have

$\sigma^i = \sigma_1^i \sigma_2^i \cdots \sigma_r^i = \tau_1^i \tau_2^i \cdots \tau_s^i$

for any integer $i \ge 0$, and from the properties of our orbits,

$\sigma_1^i a = \sigma^i a = \tau_1^i a, \quad i \ge 0$.

We conclude that

$\operatorname{orb}_{\sigma_1} a = \operatorname{orb}_{\tau_1} a$.

In other words, $\mathcal{O}_1 = \mathcal{O}_1'$. Even more,

$\sigma_1 b = \sigma b = \tau_1 b$, if $b \in \mathcal{O}_1 = \mathcal{O}_1'$,

$\sigma_1 b = b = \tau_1 b$, if $b \notin \mathcal{O}_1 = \mathcal{O}_1'$,

so $\sigma_1 = \tau_1$. Multiplying our initial factorizations by $\sigma_1^{-1}$ gives

$\sigma_1^{-1} \sigma = \sigma_2 \cdots \sigma_r = \tau_2 \cdots \tau_s$.

Now, an induction leads to r = s and $\sigma_2 = \tau_2, \ldots, \sigma_r = \tau_r$. This proves that
the factorization into disjoint cycles is unique up to the order of the factors
and the fact that when a cycle is to be written explicitly there are several
equivalent ways to do so, namely, by cycling the entries. ∎

4-5-7. Examples. Our simplified notation for permutations is now fully
justified and under control, so we turn to some concrete examples of its use
in computations. In $S_9$ consider the permutations:

$\rho_1 = (15)$, $\rho_2 = (258)$, $\rho_3 = (29835)$, $\rho_4 = (174628)$,

$\sigma = (1734)(29)(586)$, $\tau = (28537)(69)$, $\rho = (172963)(48)$.

We start with the product $\rho_1 \rho_2 = (15)(258)$. To compute this (meaning:
to express it as a product of disjoint cycles) requires tracing the action of
$\rho_1 \rho_2$ on 1, 2, 3, ..., 9. Now, 1 appears only once, in (15). This signifies that
1 is left fixed by (258) and is then moved to 5 under (15). In short, we have
1 → 5 (under $\rho_1 \rho_2$). Then 5 is mapped to 8 under (258), and 8 is kept fixed
under (15), so 5 → 8 (under $\rho_1 \rho_2$). Then 8 goes to 2 under (258) and 2 is
fixed under (15), so 8 → 2 (under $\rho_1 \rho_2$). At the next step, (258) carries 2 to 5
and then (15) carries 5 to 1, so 2 → 1 (under $\rho_1 \rho_2$). Thus, $\rho_1 \rho_2$ includes the
cycle (1582). Obviously, $\rho_1 \rho_2$ keeps 3, 4, 6, 7, 9 fixed, since these terms do
not appear in (258) or (15). Therefore,

$\rho_1 \rho_2 = (15)(258) = (1582)$.

It needs to be emphasized that in computing the product one passes through
the cycles one at a time, going from right to left, because this is the order in
which the cycles are applied.
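
The right-to-left rule can be turned into a small Python sketch. It assumes cycles are written as tuples and the underlying set is {1, ..., 9}; the names `apply_cycle`, `product` and `to_cycles` are hypothetical helpers introduced for this illustration.

```python
def apply_cycle(cycle, x):
    """Image of x under a single cycle written as a tuple."""
    if x not in cycle:
        return x
    return cycle[(cycle.index(x) + 1) % len(cycle)]


def product(cycles, n=9):
    """Multiply cycles listed left to right; they are applied right to left."""
    perm = {}
    for x in range(1, n + 1):
        y = x
        for cycle in reversed(cycles):   # right-most cycle acts first
            y = apply_cycle(cycle, y)
        perm[x] = y
    return perm


def to_cycles(perm):
    """Disjoint cycle form of a dict permutation, omitting fixed points."""
    seen, out = set(), []
    for start in sorted(perm):
        if start in seen or perm[start] == start:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        out.append(tuple(cyc))
    return out


print(to_cycles(product([(1, 5), (2, 5, 8)])))           # [(1, 5, 8, 2)]
print(to_cycles(product([(2, 5, 8), (2, 9, 8, 3, 5)])))  # [(2, 9), (3, 8)]
```

Each printed cycle starts at its smallest term, so a result may appear shifted relative to the way it is written by hand; for instance (352) is printed as (2, 3, 5).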

To compute $\rho_2 \rho_3 = (258)(29835)$, we note first that 1, 4, 6, 7 do not
appear, so they are fixed under $\rho_2 \rho_3$. Then (applying $\rho_3$ first and $\rho_2$
second), 2 → 9 → 9, 9 → 8 → 2,
so (29) is one of the cycles appearing in the factorization of $\rho_2 \rho_3$.
Furthermore, 3 → 5 → 8, 8 → 3 → 3, so (38) is also a factor. Finally,
5 → 2 → 5, so 5 is fixed under $\rho_2 \rho_3$. This shows:

$\rho_2 \rho_3 = (258)(29835) = (29)(38)$.

In similar fashion, one computes

$\rho_3 \rho_4 = (29835)(174628) = (174698)(352)$.

With a bit of practice the reader should be able to do such products in one
line, without intermediate steps. For example,

$\sigma \tau \rho = (1734)(29)(586)(28537)(69)(172963)(48) = (192548)(376)$.

How is this done? Starting with 1, its right-most appearance is in (172963),
under which 1 → 7. Going leftward from (172963), the first appearance of
7 is in (28537), under which 7 → 2. Then going leftward from (28537), the
first appearance of 2 is in (29), under which 2 → 9. There being no appearance
of 9 to the left of (29), we conclude that $\sigma \tau \rho$ maps 1 → 9. Next, we apply this
procedure to 9: under (172963), 9 → 6; then under (69), 6 → 9; then under
(29), 9 → 2; so $\sigma \tau \rho$ maps 9 → 2. Now, starting with 2, we have: 2 → 9 under
(172963), 9 → 6 under (69), 6 → 5 under (586); so $\sigma \tau \rho$ maps 2 → 5. Starting
with 5, we have: 5 → 3 under (28537), 3 → 4 under (1734); so $\sigma \tau \rho$ maps
5 → 4. Next: 4 → 8 under (48), 8 → 5 under (28537), 5 → 8 under (586); so
$\sigma \tau \rho$ maps 4 → 8. Next: 8 → 4 under (48), 4 → 1 under (1734); so $\sigma \tau \rho$ maps
8 → 1. Therefore, (192548) is a cycle belonging to the factorization of $\sigma \tau \rho$.

Similarly, starting with 3 we arrive at the cycle (376) as a factor of $\sigma \tau \rho$.
Consequently, since all nine elements of {1, 2, ..., 9} are accounted for, we
have indeed

$\sigma \tau \rho = (192548)(376)$.

Once we know how to multiply, it is easy to take powers. Thus

$\rho_1^2 = (15)^2 = (15)(15) = e$, the identity of $S_9$,

$\rho_2^2 = (258)^2 = (258)(258) = (285)$,

$\rho_3^2 = (29835)^2 = (29835)(29835) = (28593)$.

Clearly, squaring a cycle amounts to the mapping which takes every term of
the cycle two places to the right (counting cyclically). Accordingly,

$\rho_4^2 = (174628)^2 = (142)(768)$.

Furthermore, raising a cycle to the power 3 amounts to the mapping that
moves each term of the cycle three places to the right. For example,

$\rho_1^3 = (15)^3 = (15)$,

$\rho_2^3 = (258)^3 = e$, the identity of $S_9$,

$\rho_3^3 = (29835)^3 = (23958)$,

$\rho_4^3 = (174628)^3 = (16)(27)(48)$.

More generally, raising a cycle to the power $r \ge 1$ involves assigning to each
term of the cycle the term which is r places to the right.

What about raising an arbitrary permutation to a positive power r?
Because disjoint cycles commute, this is the same as raising each of its cycles
to the power r. For example

$\sigma^r = [(1734)(29)(586)]^r = (1734)^r (29)^r (586)^r$

and, in particular,

$\sigma^2 = (1734)^2 (29)^2 (586)^2 = (13)(74)(568)$,

$\sigma^3 = (1734)^3 (29)^3 (586)^3 = (1437)(29)$,

$\sigma^4 = (1734)^4 (29)^4 (586)^4 = (586)$,

and so on.
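
The “r places to the right” rule for powers of a cycle is easy to sketch in Python. The function name `cycle_power` and the tuple representation of a cycle are assumptions made for this illustration; the result is returned as a list of disjoint cycles with fixed points omitted, so the identity appears as an empty list.

```python
def cycle_power(cycle, r):
    """r-th power of a cycle: each term goes to the term r places to its right.

    Returns a list of disjoint cycles (fixed points omitted).
    """
    m = len(cycle)
    seen, out = set(), []
    for i in range(m):
        if i in seen:
            continue
        idxs, j = [], i
        while j not in seen:             # step through positions i, i+r, i+2r, ...
            seen.add(j)
            idxs.append(j)
            j = (j + r) % m
        if len(idxs) > 1:
            out.append(tuple(cycle[k] for k in idxs))
    return out


print(cycle_power((1, 7, 4, 6, 2, 8), 2))  # [(1, 4, 2), (7, 6, 8)]
print(cycle_power((1, 7, 4, 6, 2, 8), 3))  # [(1, 6), (7, 2), (4, 8)]

# A power of a product of disjoint cycles is taken cycle by cycle.
sigma = [(1, 7, 3, 4), (2, 9), (5, 8, 6)]
sigma_squared = [c for cyc in sigma for c in cycle_power(cyc, 2)]
print(sigma_squared)  # [(1, 3), (7, 4), (5, 6, 8)]
```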
Finding inverses is easy. For a cycle, its inverse is obtained by writing the
terms in the reverse order. Thus, for example,

$\rho_1^{-1} = (15)^{-1} = (51) = (15)$,

$\rho_2^{-1} = (258)^{-1} = (852) = (285)$,

$\rho_3^{-1} = (29835)^{-1} = (53892)$,

$\rho_4^{-1} = (174628)^{-1} = (826471)$.

For an arbitrary permutation $\sigma$, expressed as a product of disjoint cycles
$\sigma_1 \sigma_2 \cdots \sigma_r$, we have

$$\begin{aligned}
\sigma^{-1} &= (\sigma_1 \sigma_2 \cdots \sigma_r)^{-1} \\
&= (\sigma_r \sigma_{r-1} \cdots \sigma_2 \sigma_1)^{-1} \\
&= \sigma_1^{-1} \sigma_2^{-1} \cdots \sigma_r^{-1}.
\end{aligned}$$

[This may also be arranged as follows: $\sigma^{-1} = (\sigma_1 \cdots \sigma_r)^{-1} = \sigma_r^{-1} \cdots \sigma_1^{-1} =
\sigma_1^{-1} \sigma_2^{-1} \cdots \sigma_r^{-1}$ because, as is easily seen, for each i, $\sigma_i^{-1}$ is a cycle whose
nontrivial orbit is the same as that of $\sigma_i$, and consequently the $\sigma_i^{-1}$ commute
with each other.] In particular, for $\sigma, \tau, \rho$ as before, we have

$\sigma^{-1} = [(1734)(29)(586)]^{-1} = (586)^{-1} (29)^{-1} (1734)^{-1}
= (685)(92)(4371) = (1437)(29)(568)$,

$\tau^{-1} = [(28537)(69)]^{-1} = (69)^{-1} (28537)^{-1} = (69)(73582)$,

$\rho^{-1} = [(172963)(48)]^{-1} = (369271)(48)$.

The computation of $(\sigma\rho)^{-1}$, for example, can now be done in two different
ways. In the first place (using the preceding computations)

$(\sigma\rho)^{-1} = \rho^{-1} \sigma^{-1} = [(369271)(48)][(1437)(29)(568)] = (185973)(46)$.

Otherwise, one computes $\sigma\rho$ and then takes its inverse, thus:

$(\sigma\rho)^{-1} = [(1734)(29)(586)(172963)(48)]^{-1} = [(137958)(46)]^{-1} = (46)(859731)$.

In similar fashion, the reader may check that

$(\rho\sigma)^{-1} = (174562)(38)$;

one way to do this is to make use of the fact that

$\rho\sigma = (126547)(38)$.
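
Both routes to $(\sigma\rho)^{-1}$ can be compared in a short Python sketch. The tuple representation of cycles and the helper names `inverse_cycles` and `to_map` are assumptions made for this illustration; `inverse_cycles` is only valid for a product of disjoint cycles, which is the situation treated above.

```python
def inverse_cycles(cycles):
    """Inverse of a product of DISJOINT cycles: reverse the terms of each cycle."""
    return [tuple(reversed(c)) for c in cycles]


def to_map(cycles, n=9):
    """Dict form of a product of cycles (applied right to left) on {1..n}."""
    perm = {}
    for x in range(1, n + 1):
        y = x
        for c in reversed(cycles):
            if y in c:
                y = c[(c.index(y) + 1) % len(c)]
        perm[x] = y
    return perm


sigma = [(1, 7, 3, 4), (2, 9), (5, 8, 6)]
rho = [(1, 7, 2, 9, 6, 3), (4, 8)]

sigma_rho = to_map(sigma + rho)                            # the product sigma rho
lhs = to_map(inverse_cycles(rho) + inverse_cycles(sigma))  # rho^{-1} sigma^{-1}

# lhs undoes sigma_rho on every element, so it is the inverse.
print(all(lhs[sigma_rho[x]] == x for x in range(1, 10)))   # True
print(lhs == to_map([(1, 8, 5, 9, 7, 3), (4, 6)]))         # True
```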
For any $\sigma, \tau \in S_n$ it is convenient to introduce the notation

$\sigma^\tau = \tau \sigma \tau^{-1}$

and call $\sigma^\tau$ the conjugate of $\sigma$ by $\tau$ (or with less precision “a conjugate of $\sigma$”).
This terminology is natural since $\sigma^\tau$ is the image of $\sigma$ under the inner automorphism
(discussed in 4-4-14) $I_\tau: S_n \to S_n$, which is defined by $I_\tau: \sigma \to \tau \sigma \tau^{-1}$
and is known as conjugation by $\tau$. Among the elementary properties of our
new symbol, we have (for $\sigma, \tau, \rho, \sigma_1, \sigma_2 \in S_n$)

$(\sigma_1 \sigma_2)^\tau = \sigma_1^\tau \sigma_2^\tau$,

$\sigma^e = \sigma$,

$(\sigma^{-1})^\tau = (\sigma^\tau)^{-1}$,

$(\sigma^\tau)^\rho = \sigma^{\rho\tau}$.

The verifications are trivial.

As illustrations of the computation of conjugates we have, for the specific
$\sigma, \tau, \rho \in S_9$ treated above,

$\sigma^\tau = \tau \sigma \tau^{-1} = [(28537)(69)][(1734)(29)(586)][(69)(73582)]
= (1274)(359)(68)$,

$\tau^\rho = \rho \tau \rho^{-1} = (172963)(48)(28537)(69)(369271)(48)
= (12945)(36)$.

Actually, there is an easier way to compute conjugates and, even more, it is
easy to decide if two permutations are conjugate. This is the content of our
next result.


4-5-8. Proposition. In $S_n$, we have:

(i) $\sigma^\tau$ has the same cycle structure as $\sigma$; in fact, to compute $\sigma^\tau$ one simply
writes out the cycle expression of $\sigma$ and replaces each digit by its
image under $\tau$.

(ii) $\rho$ is a conjugate of $\sigma$ $\iff$ they have the same cycle structure.

(iii) The relation “is a conjugate of” is an equivalence relation. Each
equivalence class consists of all elements of $S_n$ which have the same
cycle structure.


Proof : We must clarify the meaning of "cycle structure." Any permutation
$\sigma \in S_n$ has an essentially unique representation as a product of disjoint
cycles. The lengths of these various cycles (meaning: a list of all the lengths)
constitute the cycle structure of $\sigma$. Because the cycles commute, the order in
which these lengths are listed does not matter; it is the collection of cycle lengths,
counted with repetition, which we care about. A permutation in $S_n$ is said to have the "same cycle
structure" as $\sigma$ when its cycle lengths are the same as those of $\sigma$. For
example, the cycle structure of $\sigma = (1734)(29)(586) \in S_9$ is given by the
numbers 4, 2, 3 (or by any permutation of these three numbers). Clearly,
$\sigma' = (13)(285)(7496) \in S_9$ has the same cycle structure as $\sigma$, namely, the
numbers 2, 3, 4. On the other hand, the cycle structure of $\tau = (28537)(69) \in S_9$
is given by the numbers 5, 2 (one might be extremely careful and write this
cycle structure as 5, 2, 1, 1, thus using up all nine digits), so $\tau$ does not have
the same cycle structure as $\sigma$.

(i) Before undertaking the proof, let us illustrate the procedure given
here for computing $\sigma^{\tau}$ as applied to the special case $\sigma = (1734)(29)(586)$,
$\tau = (28537)(69)$ in $S_9$. The recipe says: in the expression for $\sigma$, do not change
the cycle structure, but replace each digit by its image under $\tau$. Since the
action of $\tau$ is $1 \to 1$, $2 \to 8$, $3 \to 7$, $4 \to 4$, $5 \to 3$, $6 \to 9$, $7 \to 2$, $8 \to 5$, $9 \to 6$, the
result is

$$\sigma^{\tau} = (1274)(86)(359),$$

which coincides with the result of our earlier computation (at the end of
4-5-7) of $\sigma^{\tau}$.
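The recipe lends itself to a direct check against the defining product $\tau\sigma\tau^{-1}$. The sketch below is an illustration under our own naming (`relabel`, `from_cycles`, `compose`, `inverse` are hypothetical helpers, not notation from the text); permutations are dictionaries on {1, ..., 9} and products act right to left.

```python
def from_cycles(cycles, n):
    """Permutation of {1..n}, as a dict, from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for k, a in enumerate(cyc):
            perm[a] = cyc[(k + 1) % len(cyc)]
    return perm


def compose(f, g):
    """The product fg: apply g first, then f."""
    return {i: f[g[i]] for i in g}


def inverse(f):
    return {image: i for i, image in f.items()}


def relabel(cycles, tau):
    """Replace every digit in the cycle expression by its image under tau."""
    return [tuple(tau[a] for a in cyc) for cyc in cycles]


sigma_cycles = [(1, 7, 3, 4), (2, 9), (5, 8, 6)]
tau_cycles = [(2, 8, 5, 3, 7), (6, 9)]
sigma = from_cycles(sigma_cycles, 9)
tau = from_cycles(tau_cycles, 9)

by_recipe = relabel(sigma_cycles, tau)
by_product = compose(compose(tau, sigma), inverse(tau))   # tau sigma tau^{-1}

print(by_recipe)                                  # [(1, 2, 7, 4), (8, 6), (3, 5, 9)]
print(from_cycles(by_recipe, 9) == by_product)    # True
```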

To prove that, in general, the end result of our procedure is indeed $\sigma^{\tau}$,
we note first that within each cycle of the cycle-representation of $\sigma$, a pair
of adjacent elements always consists of some element $a$ and its image $\sigma a$
under $\sigma$:

$$\sigma = (\dots)\cdots(\dots, a, \sigma a, \dots)\cdots(\dots). \qquad (*)$$

Our method calls for applying $\tau$ to each entry, which leads to

$$(\dots)\cdots(\dots, \tau a, \tau\sigma a, \dots)\cdots(\dots). \qquad (**)$$

This is the cycle-representation of some permutation in $S_n$ (after all, the entries
are distinct because the entries of (*) are distinct and $\tau$ is 1-1). The action
of this permutation on any $\tau a$ is $\tau a \to \tau\sigma a$. On the other hand, the action of
$\sigma^{\tau} = \tau\sigma\tau^{-1}$ on any $\tau a$ is

$$\tau a \to (\tau\sigma\tau^{-1})(\tau a) = \tau\sigma a.$$

Since every element of $\{1, \dots, n\}$ is of the form $\tau a$, it follows that (**) is
the expression for $\sigma^{\tau}$. Of course, it is obvious from the
method that $\sigma^{\tau}$ has the same cycle structure as $\sigma$.

(ii) The implication $\Rightarrow$ has just been proved. To prove the reverse implication,
write $\rho$ beneath $\sigma$ with each cycle beneath one of the same length, thus

$$\sigma = (\dots)\cdots(\dots, a, \sigma a, \dots)\cdots(\dots),$$
$$\rho = (\dots)\cdots(\dots, b, \rho b, \dots)\cdots(\dots),$$

where any 1-cycles (that is, elements which are fixed under the permutation)
are to be included. If we write the entries of $\sigma$ in a row and the corresponding
entries of $\rho$ below them, the picture being

$$\begin{pmatrix} \cdots & a & \sigma a & \cdots \\ \cdots & b & \rho b & \cdots \end{pmatrix},$$

then the top row contains each element of $\{1, \dots, n\}$ exactly once, and so
does the bottom row. Therefore, in our old notation, this represents a
permutation in $S_n$, call it $\tau$. According to the rule [proved in (i)] for computing
$\sigma^{\tau}$, it is clear that $\sigma^{\tau} = \rho$.

To illustrate what is going on here, the permutation $\sigma' = (13)(285)(7469)$
is conjugate to $\sigma = (1734)(29)(586)$ in $S_9$ because they have the same cycle
structure. In fact, if we write

$$\tau = \begin{pmatrix} 1 & 7 & 3 & 4 & 2 & 9 & 5 & 8 & 6 \\ 7 & 4 & 6 & 9 & 1 & 3 & 2 & 8 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 7 & 1 & 6 & 9 & 2 & 5 & 4 & 8 & 3 \end{pmatrix} = (17493652),$$

then $\sigma' = \sigma^{\tau}$. Incidentally, here as in general, there are several distinct choices
for $\tau$. They arise because the cycles of $\sigma'$ can themselves be cycled; thus, if
we write $\sigma' = (4697)(31)(852)$, then

$$\tau = \begin{pmatrix} 1 & 7 & 3 & 4 & 2 & 9 & 5 & 8 & 6 \\ 4 & 6 & 9 & 7 & 3 & 1 & 8 & 5 & 2 \end{pmatrix} = (1476239)(58)$$

and again $\sigma' = \sigma^{\tau}$.
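The construction of $\tau$ in part (ii) is itself an algorithm, and a short Python sketch may make it concrete. Everything named here (`conjugator`, `with_fixed_points`, `conjugate`, and so on) is a hypothetical helper of ours; the pairing of cycles of equal length is done by a stable sort on length, which is one admissible choice among the several the text mentions.

```python
def from_cycles(cycles, n):
    """Permutation of {1..n}, as a dict, from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for k, a in enumerate(cyc):
            perm[a] = cyc[(k + 1) % len(cyc)]
    return perm


def to_cycles(perm):
    """Disjoint cycles of length > 1, each started at its smallest entry."""
    seen, cycles = set(), []
    for start in sorted(perm):
        cyc, a = [], start
        while a not in seen:
            seen.add(a)
            cyc.append(a)
            a = perm[a]
        if len(cyc) > 1:
            cycles.append(tuple(cyc))
    return cycles


def with_fixed_points(cycles, n):
    """Append the 1-cycles, so that every element of {1..n} appears once."""
    moved = {a for cyc in cycles for a in cyc}
    return list(cycles) + [(a,) for a in range(1, n + 1) if a not in moved]


def conjugator(sigma_cycles, rho_cycles, n):
    """A tau with rho = tau sigma tau^{-1}: send each entry of sigma to the
    entry of rho written beneath it, after pairing cycles of equal length."""
    top = sorted(with_fixed_points(sigma_cycles, n), key=len)
    bottom = sorted(with_fixed_points(rho_cycles, n), key=len)
    if [len(c) for c in top] != [len(c) for c in bottom]:
        raise ValueError("different cycle structures: not conjugate")
    return {a: b for ct, cb in zip(top, bottom) for a, b in zip(ct, cb)}


def conjugate(sigma, tau):
    """tau sigma tau^{-1}, with the right-hand factor acting first."""
    tau_inv = {image: i for i, image in tau.items()}
    return {i: tau[sigma[tau_inv[i]]] for i in tau}


sigma_cycles = [(1, 7, 3, 4), (2, 9), (5, 8, 6)]
sigma = from_cycles(sigma_cycles, 9)
sigma_prime = from_cycles([(1, 3), (2, 8, 5), (7, 4, 6, 9)], 9)

tau1 = conjugator(sigma_cycles, [(1, 3), (2, 8, 5), (7, 4, 6, 9)], 9)
tau2 = conjugator(sigma_cycles, [(4, 6, 9, 7), (3, 1), (8, 5, 2)], 9)
print(to_cycles(tau1))   # [(1, 7, 4, 9, 3, 6, 5, 2)]
print(to_cycles(tau2))   # [(1, 4, 7, 6, 2, 3, 9), (5, 8)]
print(conjugate(sigma, tau1) == sigma_prime, conjugate(sigma, tau2) == sigma_prime)   # True True
```

Writing the cycles of the second permutation in a different rotation, as in the call that produces `tau2`, changes the conjugating element but not the conjugate.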

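(iii) Let us write $\sigma \sim \tau$ to signify that $\tau$ is a conjugate of $\sigma$. Then: $\sigma \sim \sigma$
for all $\sigma$ (since $\sigma^{e} = \sigma$); if $\sigma \sim \tau$, then $\tau \sim \sigma$ (since $\tau = \sigma^{\rho} \Rightarrow \sigma = \tau^{(\rho^{-1})}$); if
$\sigma \sim \tau$ and $\tau \sim \rho$, then $\sigma \sim \rho$ (since $\tau = \sigma^{\lambda}$ and $\rho = \tau^{\mu}$ imply $\rho = (\sigma^{\lambda})^{\mu} = \sigma^{\mu\lambda}$).
Thus, $\sim$ is an equivalence relation and, in virtue of (ii), any equivalence
class (also known as a conjugate class) consists of all the elements of $S_n$ with
the same cycle structure. ∎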

We turn next to additional consequences of the cycle decomposition of a 
permutation. 

4-5-9. Proposition. In the group $S_n$, a cycle of length $r$ has order $r$.
Furthermore, the order of any permutation is the least common multiple
of the lengths of its disjoint cycles.

Proof : A cycle of length $r$ is of form

$$\sigma = (a_1 a_2 \cdots a_r)$$

with the entries all distinct. For $i = 1, 2, \dots, r - 1$ it is clear that $\sigma^{i} a_1 = a_{i+1}$,
so clearly $\sigma^{i} \ne e$. On the other hand, $\sigma^{r} a_j = a_j$ for $j = 1, 2, \dots, r$, so
since $\sigma^{r}$ keeps each element of $\{1, 2, \dots, n\}$ which is not an $a_j$ fixed we have
$\sigma^{r} = e$. This means that as an element of the group $S_n$, $\sigma$ has order $r$; that
is, $\operatorname{ord} \sigma = r$.

Furthermore, if $\sigma$ is any permutation in $S_n$, let $\sigma = \sigma_1\sigma_2\cdots\sigma_s$ be its
factorization into disjoint cycles. Because disjoint cycles commute, it follows
that

$$\sigma^{i} = \sigma_1^{i}\sigma_2^{i}\cdots\sigma_s^{i} \quad \text{for all } i \ge 1.$$

Now, $\sigma_j^{i}$ (for each $j = 1, \dots, s$) need not be a cycle, but the set of elements
it moves (meaning: the union of its nontrivial orbits) is contained in the single
nontrivial orbit of $\sigma_j$. Consequently, it is easy to see that $\sigma^{i} = e$ if and only
if each of $\sigma_1^{i}, \sigma_2^{i}, \dots, \sigma_s^{i}$ is equal to $e$; so $\operatorname{ord} \sigma$ is the smallest $i$ for which
$\sigma_1^{i} = e$, $\sigma_2^{i} = e$, ..., $\sigma_s^{i} = e$. This number is clearly the least common
multiple

$$[\operatorname{ord} \sigma_1, \operatorname{ord} \sigma_2, \dots, \operatorname{ord} \sigma_s]$$

or, to put it another way, the least common multiple of the lengths of the
disjoint cycles of $\sigma$. ∎
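As a numerical companion to 4-5-9, the following Python sketch computes the order of a permutation in two ways: as the least common multiple of its cycle lengths, and by brute-force powering until the identity appears. The helper names are our own, and the three permutations are the $\sigma, \tau, \rho \in S_9$ used in the earlier illustrations.

```python
from functools import reduce
from math import gcd


def from_cycles(cycles, n):
    """Permutation of {1..n}, as a dict, from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for k, a in enumerate(cyc):
            perm[a] = cyc[(k + 1) % len(cyc)]
    return perm


def compose(f, g):
    """The product fg: apply g first, then f."""
    return {i: f[g[i]] for i in g}


def cycle_lengths(perm):
    """Lengths of all disjoint cycles, 1-cycles included."""
    seen, lengths = set(), []
    for start in perm:
        length, a = 0, start
        while a not in seen:
            seen.add(a)
            length += 1
            a = perm[a]
        if length:
            lengths.append(length)
    return lengths


def lcm(a, b):
    return a * b // gcd(a, b)


def order_by_lcm(perm):
    return reduce(lcm, cycle_lengths(perm), 1)


def order_by_powers(perm):
    """Smallest k >= 1 with perm^k equal to the identity."""
    identity = {i: i for i in perm}
    power, k = dict(perm), 1
    while power != identity:
        power, k = compose(perm, power), k + 1
    return k


sigma = from_cycles([(1, 7, 3, 4), (2, 9), (5, 8, 6)], 9)
tau = from_cycles([(2, 8, 5, 3, 7), (6, 9)], 9)
rho = from_cycles([(1, 7, 2, 9, 6, 3), (4, 8)], 9)

for perm in (sigma, tau, rho):
    print(order_by_lcm(perm), order_by_powers(perm))   # 12 12, then 10 10, then 6 6
```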

4-5-10. Proposition.

(i) Any permutation in $S_n$ can be expressed as a product of transpositions.
In other words, the set of all transpositions generates $S_n$.

(ii) The $n - 1$ transpositions $(12), (13), \dots, (1n)$ generate $S_n$.

(iii) The $n - 1$ transpositions $(12), (23), (34), \dots, (n - 1, n)$ generate $S_n$.


Proof : (i) Since any permutation can be expressed as a product of cycles,
it suffices to express any cycle as a product of transpositions. But this is easy;
a typical cycle looks like $(a_1 a_2 \cdots a_r)$, and it may be written as

$$(a_1 a_2 \cdots a_r) = (a_1 a_r)(a_1 a_{r-1}) \cdots (a_1 a_3)(a_1 a_2).$$

(ii) It suffices to express an arbitrary transposition $(ab)$ in terms of
$(12), (13), \dots, (1n)$. But this is trivial, since

$$(ab) = (1a)(1b)(1a).$$

As a matter of fact, this formula is not a complete surprise, for, according
to 4-5-8, $(ab)$ is conjugate to each of the transpositions $(12), \dots, (1n)$. In
particular, if $\rho = (ab)$ and $\sigma = (1b)$, then upon putting $\tau = \begin{pmatrix} 1 & b \\ a & b \end{pmatrix} = (1a)$ we
have $\rho = \sigma^{\tau}$. Incidentally, we also have

$$(ab) = (1b)(1a)(1b),$$

this being the case $\rho = (ab)$, $\sigma = (1a) = (a1)$, $\tau = \begin{pmatrix} a & 1 \\ a & b \end{pmatrix} = (1b)$ and $\rho = \sigma^{\tau}$.

(iii) In virtue of (ii), it suffices to express each transposition $(12), (13), \dots, (1n)$
in terms of the transpositions $(12), (23), \dots, (n - 1, n)$. This is clear
from the formula

$$(1i) = (i - 1, i) \cdots (34)(23)(12)(23)(34) \cdots (i - 1, i), \qquad 2 < i \le n,$$

which arises from the facts about conjugation. Namely, if one writes $\tau_1 = (12)$,
$\tau_2 = (23)$, $\tau_3 = (34)$, ..., $\tau_{n-1} = (n - 1, n)$, then

$$\tau_1^{\tau_2} = (23)(12)(23)^{-1} = (23)(12)(23) = (13),$$

$$(\tau_1^{\tau_2})^{\tau_3} = \tau_1^{\tau_3\tau_2} = [(34)(23)](12)[(34)(23)]^{-1} = (34)(23)(12)(23)(34) = (14),$$

and one may proceed inductively. This completes the proof. ∎
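The three factorizations used in this proof can be multiplied out by machine. The Python sketch below is only an illustration: the function names are ours, each factor is given as a single cycle, and a written product is evaluated with its rightmost factor acting first.

```python
def from_cycles(cycles, n):
    """Permutation of {1..n}, as a dict, from a list of disjoint cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for k, a in enumerate(cyc):
            perm[a] = cyc[(k + 1) % len(cyc)]
    return perm


def compose(f, g):
    """The product fg: apply g first, then f."""
    return {i: f[g[i]] for i in g}


def product(factors, n):
    """Product of the listed cycles, written left to right, acting right to left."""
    result = {i: i for i in range(1, n + 1)}
    for cyc in factors:
        result = compose(result, from_cycles([cyc], n))
    return result


def cycle_as_transpositions(cyc):
    """(a1 a2 ... ar) = (a1 ar)(a1 a_{r-1}) ... (a1 a3)(a1 a2)."""
    return [(cyc[0], a) for a in reversed(cyc[1:])]


def one_i_from_adjacent(i):
    """(1 i) = (i-1, i) ... (23)(12)(23) ... (i-1, i), for i >= 2."""
    rising = [(k, k + 1) for k in range(2, i)]     # (23), (34), ..., (i-1, i)
    return rising[::-1] + [(1, 2)] + rising


print(cycle_as_transpositions((1, 7, 3, 4)))   # [(1, 4), (1, 3), (1, 7)]
print(one_i_from_adjacent(4))                  # [(3, 4), (2, 3), (1, 2), (2, 3), (3, 4)]
print(product(one_i_from_adjacent(4), 4) == from_cycles([(1, 4)], 4))        # True
print(product([(1, 3), (1, 5), (1, 3)], 5) == from_cycles([(3, 5)], 5))      # True
```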

Obviously, any permutation can be expressed as a product of transpositions
in many ways; however, it is not obvious that for a given permutation
the number of transpositions in such a product is either always odd or
always even. There are several approaches to this result, all of which involve
some degree of artificiality. We take the most common approach.


4-5-11. Discussion. Our permutations come from the group $S_n$. There
is no harm in assuming $n \ge 3$, since $S_n$ is rather trivial for $n = 1$ or 2. Let us
take $n$ independent formal symbols (or variables) $x_1, x_2, \dots, x_n$ and introduce
the expression

$$\Delta = \Delta_n = \prod_{i<j} (x_i - x_j), \qquad 1 \le i < j \le n$$

(which may be referred to as the discriminant). For example, if $n = 3$, then

$$\Delta_3 = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3) \qquad (*)$$

while, for $n = 4$, we have

$$\Delta_4 = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4) \qquad (**)$$

which may also be written in the triangular form

$$\begin{array}{lll}
(x_1 - x_2) & (x_1 - x_3) & (x_1 - x_4) \\
 & (x_2 - x_3) & (x_2 - x_4) \\
 & & (x_3 - x_4).
\end{array}$$

More generally, for arbitrary $n$, the discriminant $\Delta_n$ may be expressed in the
triangular form

$$\Delta = \Delta_n = \left\{ \begin{array}{llllll}
(x_1 - x_2) & (x_1 - x_3) & \cdots & (x_1 - x_j) & \cdots & (x_1 - x_n) \\
 & (x_2 - x_3) & \cdots & (x_2 - x_j) & \cdots & (x_2 - x_n) \\
 & & \ddots & \vdots & & \vdots \\
 & & & (x_i - x_j) & \cdots & (x_i - x_n) \\
 & & & & \ddots & \vdots \\
 & & & & (x_{n-2} - x_{n-1}) & (x_{n-2} - x_n) \\
 & & & & & (x_{n-1} - x_n)
\end{array} \right. \qquad (***)$$

The number of factors $(x_i - x_j)$ in $\Delta_n$ is clearly $(n^2 - n)/2$.

It should be pointed out why $\Delta$, which appears to be a product, should
indeed be interpreted as a product. Starting from the integral domain $\mathbf{Z}$ we
form the polynomial ring $\mathbf{Z}[x_1]$, which is an integral domain (see Section
3-3). Next we form the polynomial ring $(\mathbf{Z}[x_1])[x_2]$, which we write as
$\mathbf{Z}[x_1, x_2]$, it too being an integral domain. Proceeding inductively, we end
with $(\mathbf{Z}[x_1, \dots, x_{n-1}])[x_n]$, which is written as $\mathbf{Z}[x_1, \dots, x_n]$ and is known as
the "polynomial ring in $n$ variables over $\mathbf{Z}$"; it is an integral domain. (Our
initial assumption that $x_1, \dots, x_n$ are independent variables is designed as
an intuitive way of saying: Each $x_i$ is an indeterminate over $\mathbf{Z}[x_1, \dots, x_{i-1}]$,
which then enables us to form the polynomial ring $\mathbf{Z}[x_1, \dots, x_{i-1}, x_i]$.)
Each $(x_i - x_j)$ with $i < j$ is a nonzero element of the domain $\mathbf{Z}[x_1, \dots, x_n]$ and,
consequently, their product $\Delta = \Delta_n$ is a nonzero element of $\mathbf{Z}[x_1, \dots, x_n]$.

Now, given any $\sigma \in S_n$, let us put (for $\Delta = \Delta_n$)

$$\sigma\Delta = \prod_{i<j} (x_{\sigma i} - x_{\sigma j}), \qquad 1 \le i < j \le n. \qquad (\#)$$

To illustrate this definition we note that in the case where $\sigma = (13)(24) \in S_4$
we have [using (**)]

$$\sigma\Delta = (x_3 - x_4)(x_3 - x_1)(x_3 - x_2)(x_4 - x_1)(x_4 - x_2)(x_1 - x_2).$$

Incidentally, the reader will observe that this equals

$$(x_1 - x_2)[-(x_1 - x_3)][-(x_1 - x_4)][-(x_2 - x_3)][-(x_2 - x_4)](x_3 - x_4),$$

which is precisely $\Delta$.

In order to understand the definition of $\sigma\Delta$, it is useful to put it in the
proper context. Given a permutation $\sigma \in S_n$ we may define a permutation
(also denoted by $\sigma$) of our set of independent variables $\{x_1, \dots, x_n\}$ by putting

$$\sigma x_i = x_{\sigma i}, \qquad i = 1, 2, \dots, n.$$

Next, we can define a mapping (still denoted by $\sigma$, since no harm will come
of it)

$$\sigma : \mathbf{Z}[x_1, \dots, x_n] \to \mathbf{Z}[x_1, \dots, x_n]$$

as follows: an element of $\mathbf{Z}[x_1, \dots, x_n]$ (that is, a polynomial in $n$ variables
over $\mathbf{Z}$) is of form

$$f = f(x_1, x_2, \dots, x_n) = \sum a_{i_1, i_2, \dots, i_n} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$$

where the exponents $i_1, i_2, \dots, i_n$ are all greater than or equal to 0 and
$a_{i_1, i_2, \dots, i_n} \in \mathbf{Z}$, and $\sigma$ maps $f$ to the element

$$\sigma f = f(x_{\sigma 1}, x_{\sigma 2}, \dots, x_{\sigma n}) = \sum a_{i_1, i_2, \dots, i_n} x_{\sigma 1}^{i_1} x_{\sigma 2}^{i_2} \cdots x_{\sigma n}^{i_n}.$$

In other words, on $\mathbf{Z}$, $\sigma$ is the identity map, that is, $\sigma a = a$ for every $a \in \mathbf{Z}$;
on the set $\{x_1, \dots, x_n\}$, $\sigma$ acts as above, that is, $\sigma x_i = x_{\sigma i}$ for $i = 1, 2, \dots, n$;
$\sigma$ preserves sums and products of elements from $\mathbf{Z}$ and $\{x_1, \dots, x_n\}$. These
words obscure a rather obvious definition; for example, if $\sigma = (13)(24) \in S_4$
and the polynomial in $\mathbf{Z}[x_1, x_2, x_3, x_4]$ is


$f = f(x_1, x_2, x_3, x_4)$

= 3 + 5x x — x 3 + 2X[X, — 4x| + x 2 x|x| — 3 x t x 2 x 3 x 4 , 

then 


$\sigma f = f(x_{\sigma 1}, x_{\sigma 2}, x_{\sigma 3}, x_{\sigma 4})$

= 3 + 5x 3 — X x + 2 * 3 X 4 — 4*1 + *4*1*1 — 3X3X4XJX2 . 

In short, one simply applies the original permutation $\sigma$ to the subscripts
of the $x_i$'s.

It is easy to see that $\sigma$ is a one-to-one map of $\mathbf{Z}[x_1, x_2, \dots, x_n]$ onto
itself and, even more, that $\sigma$ is an automorphism of the ring $\mathbf{Z}[x_1, x_2, \dots, x_n]$.
Its inverse $\sigma^{-1}$ is precisely the automorphism which we obtain by starting
from the permutation $\sigma^{-1} \in S_n$.

Of course, and this is the nub of the matter for us, when the automorphism
$\sigma$ is applied to the element $\Delta \in \mathbf{Z}[x_1, \dots, x_n]$ the result is the element defined
earlier [in (#)] as $\sigma\Delta$; in other words, $\sigma(\Delta)$ equals $\sigma\Delta$!


4-5-12. Proposition. If $\sigma \in S_n$, then $\sigma\Delta = \pm\Delta$. Thus we may define
$\operatorname{sgn} \sigma$ (called "signum of $\sigma$") to be either $+1$ or $-1$ in such a way that

$$\sigma\Delta = (\operatorname{sgn} \sigma)\Delta.$$

The permutation $\sigma$ is said to be even or odd according as $\operatorname{sgn} \sigma$ is $+1$ or
$-1$.

Proof : Consider a single term $(x_{\sigma i} - x_{\sigma j})$ of the product

$$\sigma\Delta = \prod_{i<j} (x_{\sigma i} - x_{\sigma j}).$$

If $\sigma i < \sigma j$, we leave $(x_{\sigma i} - x_{\sigma j})$ alone in the product, since it is one of the
terms appearing in

$$\Delta = \prod_{i<j} (x_i - x_j).$$

If $\sigma j < \sigma i$, we replace $(x_{\sigma i} - x_{\sigma j})$ by $-(x_{\sigma j} - x_{\sigma i})$ in the product. (Of course,
we cannot have $\sigma i = \sigma j$ since $i \ne j$.) Once this has been done for each $(x_{\sigma i} - x_{\sigma j})$
we see that $\sigma\Delta$ is equal to $(-1)^m$ (where $m$ is the number of times that $\sigma j < \sigma i$)
times a product of terms, each of which appears in $\Delta$. The number of terms
in this product is equal to the number of terms in $\sigma\Delta$, which is equal to the
number of terms in $\Delta$, namely, $(n^2 - n)/2$. Furthermore, it is easy to see
that the $(n^2 - n)/2$ terms of the product are distinct (since if $i < j$, $i' < j'$, then
$x_{\sigma i} - x_{\sigma j} = x_{\sigma i'} - x_{\sigma j'}$ implies $i = i'$, $j = j'$, and $x_{\sigma i} - x_{\sigma j} = -(x_{\sigma i'} - x_{\sigma j'})$
implies $i = j'$, $j = i'$, which is impossible); consequently the product is $\Delta$. In
short,

$$\sigma\Delta = (-1)^m \Delta = \pm\Delta.$$

We do not care to evaluate $m$, but simply observe that $(-1)^m$ is what we have
chosen to call $\operatorname{sgn} \sigma$. ∎
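The number $m$ of this proof is easy to count by machine. The Python sketch below computes $(-1)^m$ directly and, as a cross-check, compares $\sigma\Delta$ with $\Delta$ at the sample point $x_k = k$. Both function names are ours, a permutation is given as the tuple of images $(\sigma 1, \dots, \sigma n)$, and evaluating at one integer point is only a numerical stand-in for the identity in $\mathbf{Z}[x_1, \dots, x_n]$, not a proof of it.

```python
from itertools import combinations


def delta(x):
    """Delta_n evaluated at x = (x_1, ..., x_n): product of x_i - x_j over i < j."""
    value = 1
    for i, j in combinations(range(len(x)), 2):
        value *= x[i] - x[j]
    return value


def sgn_by_inversions(images):
    """(-1)^m, where m counts the pairs i < j with sigma(j) < sigma(i)."""
    m = sum(1 for i, j in combinations(range(len(images)), 2)
            if images[j] < images[i])
    return (-1) ** m


def sgn_by_delta(images):
    """Ratio of sigma(Delta) to Delta at the sample point x_k = k."""
    x = list(range(1, len(images) + 1))
    permuted = [x[s - 1] for s in images]      # x_{sigma 1}, ..., x_{sigma n}
    return delta(permuted) // delta(x)


# sigma = (13)(24) in S_4 has images (3, 4, 1, 2); (12) has images (2, 1, 3, 4)
print(sgn_by_inversions((3, 4, 1, 2)), sgn_by_delta((3, 4, 1, 2)))   # 1 1
print(sgn_by_inversions((2, 1, 3, 4)), sgn_by_delta((2, 1, 3, 4)))   # -1 -1
```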


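4-5-13. Proposition. The sgn function has the following properties, for
$\sigma, \tau \in S_n$:

(i) $\operatorname{sgn} e = 1$,

(ii) $\operatorname{sgn}(\sigma\tau) = (\operatorname{sgn} \sigma)(\operatorname{sgn} \tau)$,

(iii) $\operatorname{sgn}(\sigma^{-1}) = (\operatorname{sgn} \sigma)^{-1} = \operatorname{sgn} \sigma$.


Proof : (i) is trivial because $e\Delta = \Delta$. The proof of (ii) consists of the
observation, $(\sigma\tau)\Delta = (\operatorname{sgn} \sigma\tau)\Delta$, coupled with the computation

$$\begin{aligned}
(\sigma\tau)\Delta &= \sigma(\tau\Delta) \\
&= \sigma[(\operatorname{sgn} \tau)\Delta] \\
&= (\operatorname{sgn} \tau)(\sigma\Delta) \qquad \text{(since $\sigma$ is a homomorphism, and $\sigma a = a$ for every $a \in \mathbf{Z}$)} \\
&= (\operatorname{sgn} \tau)[(\operatorname{sgn} \sigma)\Delta] \\
&= (\operatorname{sgn} \sigma)(\operatorname{sgn} \tau)\Delta.
\end{aligned}$$

Comparing the two expressions for $(\sigma\tau)\Delta$ and cancelling the nonzero element $\Delta$ gives (ii).

As for (iii), by making use of (i) and (ii) we have

$$(\operatorname{sgn} \sigma)(\operatorname{sgn} \sigma^{-1}) = \operatorname{sgn}(\sigma\sigma^{-1}) = \operatorname{sgn} e = 1.$$

Since both $\operatorname{sgn} \sigma$ and $\operatorname{sgn} \sigma^{-1}$ come from $\{+1, -1\}$, and their product is 1,
they must be equal. ∎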

In virtue of the preceding [part (ii)] we have, for permutations

    even × even = odd × odd = even,
    even × odd = odd × even = odd.

Furthermore, any $\sigma \in S_n$ can be written as a product of transpositions so,
according to these rules, the "parity" of $\sigma$ (meaning: whether $\sigma$ is odd or
even, that is, whether $\operatorname{sgn} \sigma$ is $-1$ or $+1$) is determined by the parities of
the transpositions in the product. But these behave nicely because:


4-5-14. Lemma. Every transposition in $S_n$ is odd.


Proof : First let us examine the simplest transposition, namely, (12).
When (12) is applied to $\Delta = \Delta_n$ [as given in the triangular form (***)], only
the terms in the first two rows are affected. Obviously, (12) maps $(x_1 - x_2)$
to $(x_2 - x_1)$. Also, under (12), each $(x_1 - x_j)$ with $j > 2$ [these being the
remaining terms in the first row of (***)] is interchanged with $(x_2 - x_j)$,
which is the term in the second row directly below it. Clearly, we then have
$(12)\Delta = -\Delta$, so

$$\operatorname{sgn}(12) = -1.$$

Now, consider an arbitrary transposition $\rho \in S_n$. It is conjugate to (12)
(by 4-5-8), so there exists $\tau \in S_n$ for which

$$\rho = (12)^{\tau} = \tau(12)\tau^{-1}.$$

Then, using 4-5-13 and the fact that the values of sgn are $\pm 1$,

$$\begin{aligned}
\operatorname{sgn} \rho = \operatorname{sgn}[\tau(12)\tau^{-1}] &= (\operatorname{sgn} \tau)(\operatorname{sgn}(12))(\operatorname{sgn} \tau^{-1}) \\
&= (\operatorname{sgn} \tau)(\operatorname{sgn}(12))(\operatorname{sgn} \tau) \\
&= (\operatorname{sgn} \tau)^2(\operatorname{sgn}(12)) = \operatorname{sgn}(12) = -1.
\end{aligned}$$

This completes the proof. ∎


4-5-15. Proposition. Let $\sigma \in S_n$ be given. Then the number of terms in
any expression for $\sigma$ as a product of transpositions is always odd or
always even. In fact, this number is odd or even according as the permutation
$\sigma$ is odd or even.

Proof : Consider two factorizations

$$\sigma = \sigma_1\sigma_2\cdots\sigma_s = \tau_1\tau_2\cdots\tau_t$$

where each $\sigma_i$ and $\tau_j$ is a transposition. Because sgn is multiplicative and sgn
of any transposition is $-1$, we have

$$\operatorname{sgn} \sigma = (-1)^s = (-1)^t.$$

Therefore,

$$s \text{ is odd} \Leftrightarrow \operatorname{sgn} \sigma = -1 \Leftrightarrow t \text{ is odd},$$
$$s \text{ is even} \Leftrightarrow \operatorname{sgn} \sigma = 1 \Leftrightarrow t \text{ is even}.$$

The validity of our proposition is now immediate. ∎

This result explains why the terminology "odd" or "even" permutations
was chosen (in 4-5-12) rather than some other set of words.

4-5-16. Corollary. An $r$-cycle is an even (odd) permutation $\Leftrightarrow$ $r$ is odd
(even).

Proof : An arbitrary $r$-cycle $(a_1 a_2 \cdots a_r)$ can be written as a product of
$r - 1$ transpositions: $(a_1 a_r) \cdots (a_1 a_3)(a_1 a_2)$. Since each transposition is an
odd permutation, our assertion is immediate. ∎
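Taken together with the multiplicativity of sgn, the corollary gives a quick way to read off the parity of a permutation from its cycle lengths: each $r$-cycle contributes $r - 1$ transpositions. The Python sketch below (our own helper names; a permutation is the tuple of images $(\sigma 1, \dots, \sigma n)$) compares this with the count of pairs used in the proof of 4-5-12, over all of $S_5$.

```python
from itertools import combinations, permutations


def cycle_lengths(images):
    """Lengths of the disjoint cycles of the permutation i -> images[i - 1]."""
    seen, lengths = set(), []
    for start in range(1, len(images) + 1):
        length, a = 0, start
        while a not in seen:
            seen.add(a)
            length += 1
            a = images[a - 1]
        if length:
            lengths.append(length)
    return lengths


def sgn_by_cycles(images):
    """Each r-cycle is a product of r - 1 transpositions, each of sign -1."""
    return (-1) ** sum(r - 1 for r in cycle_lengths(images))


def sgn_by_inversions(images):
    """(-1)^m, where m counts the pairs i < j with sigma(j) < sigma(i)."""
    m = sum(1 for i, j in combinations(range(len(images)), 2)
            if images[j] < images[i])
    return (-1) ** m


agree = all(sgn_by_cycles(p) == sgn_by_inversions(p)
            for p in permutations(range(1, 6)))
print(agree)                              # True
print(sgn_by_cycles((2, 3, 1, 4, 5)))     # 1: the 3-cycle (123) is even
print(sgn_by_cycles((2, 3, 4, 1, 5)))     # -1: the 4-cycle (1234) is odd
```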

The classification of permutations into even and odd ones enables us to 
distinguish an important subgroup of the symmetric group—namely: 

4-5-17. Proposition. Let $A_n$ denote the set of all even permutations in
$S_n$. Then $A_n$ is a normal subgroup of $S_n$ with $\#A_n = n!/2$ and $(S_n : A_n) = 2$.
Furthermore, $A_n$, which is known as the alternating group on $n$ letters, is
generated by the set of all 3-cycles.

Proof : $A_n$ is a nonempty subset of the finite group $S_n$, and it is closed
under multiplication since the product of even permutations is even. According
to 4-2-3, $A_n$ is a subgroup of $S_n$.

For any $\tau \in A_n$ and $\sigma \in S_n$ we have

$$\operatorname{sgn}(\sigma\tau\sigma^{-1}) = (\operatorname{sgn} \sigma)(\operatorname{sgn} \tau)(\operatorname{sgn} \sigma^{-1}) = \operatorname{sgn} \tau,$$

so $\sigma\tau\sigma^{-1} \in A_n$. Thus, $A_n$ is a normal subgroup.

Fix any odd permutation $\rho$ [for example, $\rho = (12)$ would do]; so, of
course, $\rho^{-1}$ is also odd. Consider the set

$$\rho A_n = \{\rho\tau \mid \tau \in A_n\}.$$

Every element of $\rho A_n$ is an odd permutation, being the product of an odd
permutation and an even one. On the other hand, if $\sigma$ is any odd permutation,
then $\rho^{-1}\sigma$ is even, being the product of two odd permutations; thus, $\rho^{-1}\sigma \in A_n$
and $\sigma \in \rho A_n$. This shows that $\rho A_n$ is the set of all odd permutations.
Clearly, we have a 1-1 correspondence

$$\tau \leftrightarrow \rho\tau, \qquad \tau \in A_n$$

between $A_n$ and $\rho A_n$. Since any permutation is either even or odd, but not
both, we have

$$S_n = A_n \cup \rho A_n, \qquad A_n \cap \rho A_n = \emptyset.$$

Therefore,

$$\#(A_n) = \#(\rho A_n) = \tfrac{1}{2}(\#S_n) = \frac{n!}{2};$$

and because this is the coset decomposition of $S_n$ with respect to $A_n$, the
index of $A_n$ in $S_n$ is 2: $(S_n : A_n) = 2$.

A nicer way to see all this is as follows: The sgn function provides a
homomorphism from the group $S_n$ onto the multiplicative group with two
elements, $\{\pm 1\}$ [since $\operatorname{sgn}(\sigma\tau) = (\operatorname{sgn} \sigma)(\operatorname{sgn} \tau)$]. It is clear that the kernel is
$\{\tau \in S_n \mid \operatorname{sgn} \tau = 1\} = A_n$, the set of all even permutations. Hence, by 4-4-6,
$A_n$ is a normal subgroup of $S_n$. According to 4-4-12, the factor group $S_n/A_n$
is isomorphic to the two element group $\{\pm 1\}$. In particular, there are
exactly two cosets of $A_n$ in $S_n$. Hence, $(S_n : A_n) = 2$, and by Lagrange, 4-2-16,
$\#A_n = \tfrac{1}{2}(\#S_n) = n!/2$.

It remains to show that $A_n$ is generated by the set of all 3-cycles. In virtue
of 4-5-16, every 3-cycle is even, so it belongs to $A_n$. Conversely, given $\tau \in A_n$,
it can be expressed as the product of an even number of transpositions.
These transpositions can be arranged in adjacent pairs. Now, two adjacent
transpositions are disjoint or they have one entry in common. (The possibility
that they are identical is excluded because then their product is the identity,
so there was no need to include such a pair in the product.) In the first case,
the product of the two transpositions looks like

$$(ab)(cd) = (acd)(abd),$$

and in the second case it looks like

$$(ab)(ac) = (acb).$$

Consequently, $\tau \in A_n$ is a product of 3-cycles, and the proof is complete. ∎
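For small $n$ both assertions of 4-5-17 can be confirmed by enumeration. In the Python sketch below the names are ours, and for convenience the letters are 0, 1, ..., n-1 rather than 1, ..., n (an assumption of the sketch, not of the text). The subgroup generated by the 3-cycles is found by closing up under left multiplication, which suffices in a finite group.

```python
from itertools import combinations, permutations
from math import factorial


def compose(p, q):
    """The product pq on {0..n-1}: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def sgn(p):
    """(-1)^m, where m counts the pairs i < j with p(j) < p(i)."""
    m = sum(1 for i, j in combinations(range(len(p)), 2) if p[j] < p[i])
    return (-1) ** m


def three_cycles(n):
    """All 3-cycles (a b c) on {0..n-1}, as tuples of images."""
    cycles = set()
    for a, b, c in permutations(range(n), 3):
        images = list(range(n))
        images[a], images[b], images[c] = b, c, a
        cycles.add(tuple(images))
    return cycles


def generated_subgroup(generators, n):
    """Closure of the generators under multiplication."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def alternating_group(n):
    return {p for p in permutations(range(n)) if sgn(p) == 1}


for n in (3, 4, 5):
    a_n = alternating_group(n)
    print(n, len(a_n), factorial(n) // 2,
          generated_subgroup(three_cycles(n), n) == a_n)
# 3 3 3 True
# 4 12 12 True
# 5 60 60 True
```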
4-5-18. Exercise. We study a subgroup of $S_n$ whose origins are geometric.

For any $n \ge 3$ consider the regular polygon with $n$ sides. Place it in the
plane with one vertex on the X-axis, and label its $n$ vertices as indicated in
the accompanying diagram. The set of all rigid motions of our regular
$n$-gon forms a group. It is a subgroup of $S_n$ since every such motion is described
by a permutation of the set of vertices $\{1, 2, \dots, n\}$. An element of
this group, which we call the $n$th dihedral group and denote by $D_n$, is
completely determined by the way it maps the vertices at 1 and 2; so $\#(D_n) = 2n$.

Let $\sigma \in D_n$ denote the rotation by $2\pi/n$ radians in a counterclockwise
direction. As a permutation this takes the form

$$\sigma = (123 \cdots n),$$

so $\sigma$ has order $n$ and $\sigma, \sigma^2, \dots, \sigma^{n-1}, \sigma^n = e$ are distinct.

Let $\tau$ denote the flip (that is, rotation by $\pi$ radians) of our regular $n$-gon
over with respect to the X-axis. It is easy to express $\tau$ as a permutation; the
result depends on whether $n$ is even or odd, but in either case $\tau$ is given as
a product of disjoint transpositions. Clearly, $\tau^2 = e$.

Compute $\sigma\tau$. Since $\sigma\tau$ maps $1 \to 2$ and $2 \to 1$, it follows that $(\sigma\tau)^2$ maps
$1 \to 1$ and $2 \to 2$. This suffices to guarantee that $(\sigma\tau)^2 = e$. Since $\tau^{-1} = \tau$, this
implies

$$\tau\sigma = \sigma^{-1}\tau = \sigma^{n-1}\tau. \qquad (*)$$

The subgroups of $D_n$ generated by $\sigma$ and by $\tau$ are

$$[\sigma] = \{e, \sigma, \sigma^2, \dots, \sigma^{n-1}\}, \qquad [\tau] = \{e, \tau\}.$$

What is the subgroup of $D_n$ generated by the 2-element set $\{\sigma, \tau\}$? In virtue
of (*), every element of this group $[\{\sigma, \tau\}]$ can be reduced to the form

$$\sigma^r\tau^s, \qquad 0 \le r \le n - 1, \quad 0 \le s \le 1.$$

These elements are distinct, and since there are $2n$ of them it follows that

$$D_n = \{\sigma^r\tau^s \mid 0 \le r \le n - 1, \ 0 \le s \le 1\}.$$

Finally, we leave it for the reader to show that the symmetries of the
regular $n$-gon form a group; it is $D_n$.
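The relation (*) and the count of $2n$ elements can be tried out for particular $n$. The Python sketch below rests on an assumption about the labelling: the vertices are taken to be numbered 1, 2, ..., n counterclockwise with vertex 1 on the X-axis, which agrees with the statement that $\sigma\tau$ maps $1 \to 2$ and $2 \to 1$. Internally vertex $i$ is stored as $i - 1$, and all names are ours.

```python
def compose(p, q):
    """The product pq on {0..n-1}: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def dihedral_generators(n):
    """sigma: the rotation i -> i + 1; tau: the flip fixing vertex 1 (index 0)."""
    sigma = tuple((i + 1) % n for i in range(n))
    tau = tuple((-i) % n for i in range(n))
    return sigma, tau


def power(p, k):
    result = tuple(range(len(p)))
    for _ in range(k):
        result = compose(p, result)
    return result


def dihedral_elements(n):
    """The elements sigma^r tau^s with 0 <= r <= n - 1 and 0 <= s <= 1."""
    sigma, tau = dihedral_generators(n)
    return [compose(power(sigma, r), power(tau, s))
            for r in range(n) for s in range(2)]


def is_closed(elements):
    pool = set(elements)
    return all(compose(a, b) in pool for a in pool for b in pool)


for n in (3, 5, 6):
    sigma, tau = dihedral_generators(n)
    elements = dihedral_elements(n)
    relation = compose(tau, sigma) == compose(power(sigma, n - 1), tau)
    print(n, len(set(elements)), relation, is_closed(elements))
# 3 6 True True
# 5 10 True True
# 6 12 True True
```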


4-5-19 / PROBLEMS 


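1. Give the compact cycle form of the following permutations:

(i) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 2 & 7 & 4 & 8 & 3 & 6 & 5 & 1 \end{pmatrix}$, (ii) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 7 & 4 & 1 & 5 & 8 & 2 & 6 & 3 \end{pmatrix}$,

(iii) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 4 & 6 & 9 & 2 & 7 & 1 & 3 & 8 & 5 \end{pmatrix}$, (iv) $\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 6 & 3 & 5 & 1 & 4 & 9 & 7 & 2 & 8 \end{pmatrix}$.

2. Express each of the following permutations in the two-line form:

(i) (2468) in $S_9$, (ii) (13579) in $S_9$,

(iii) (14923)(6875) in $S_9$, (iv) (152)(378)(46) in $S_8$.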

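3. Given the permutations (in $S_9$)

$$\sigma = (385)(21)(794), \qquad \tau = (15936)(27)(48), \qquad \rho = (2945)(186)(37),$$

compute:

(i) $\sigma^2, \tau^2, \rho^2$,
(ii) $\sigma^3, \tau^3, \rho^3$,
(iii) $\tau^{-1}, \rho^{-1}$,
(iv) $\sigma\tau, \tau\rho, \rho\sigma, \tau\sigma, \rho\tau, \sigma\rho$,
(v) $\sigma\tau\rho, \tau\rho\sigma, \rho\sigma\tau$,
(vi) $\sigma^{-1}\tau\rho, \sigma\tau^{-1}\rho, \sigma\tau\rho^{-1}$,
(vii) $\sigma^{\tau}, \tau^{\rho}, \rho^{\sigma}$,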

(viii) $(\sigma^{\tau})^{\rho}, (\tau^{\rho})^{\sigma}, (\rho^{\sigma})^{\tau}$.


4. Find the order of each of the permutations in Problem 2. With $\sigma, \tau, \rho$
as in Problem 3, find the order of $\sigma, \tau, \rho, \sigma^2, \tau^3, \rho^{-1}, \sigma\tau, \rho\sigma\tau, \sigma\tau^{-1}\rho, \sigma^{\tau}$.

5. For any $\sigma \in S_n$, show that the orbits of $\sigma^{-1}$ are the same as those of $\sigma$.
How do the orbits of $\sigma^2$ compare with those of $\sigma$?

6. The permutations $\sigma = (28175)(36)(49)$ and $\rho = (43962)(18)(57)$ are
conjugate. Find $\tau \in S_9$ for which $\rho = \sigma^{\tau}$. Find three other choices for $\tau$.
How many such $\tau$'s are there?

7. (i) List the six elements of $S_3$ in cycle form. Which ones are odd?
Which are even? What is the order of each one? Give the decomposition
of $S_3$ into conjugate classes.

(ii) Do the same thing for the 24 elements of $S_4$.

8. How many conjugate classes are there in

(i) $S_5$, (ii) $S_6$?

List a representative for each conjugate class.

9. Show that any expression for the identity $e$ of $S_n$ as a product of transpositions
has an even number of terms.

10. Show that if two permutations are conjugate then they are both odd or
they are both even. How do their orders compare?

11. (i) In $S_9$, exhibit an odd permutation of order 10. Can you find an even
permutation of order 10?

(ii) Show that in $S_8$ every element of order 10 is odd.

12. Suppose $\sigma \in S_n$ is given as a product of cycles, which need not be disjoint.
Show that $\sigma$ is an even permutation $\Leftrightarrow$ the product has an even number
of cycles of even length. What is the condition for $\sigma$ to be odd?

13. Express each element of $S_4$ as a product of transpositions using:

(i) any transpositions,

(ii) only the transpositions (12), (13), (14),

(iii) only the transpositions (12), (23), (34).

14. How would you express 4-5-10 as a statement about shuffling a deck of
cards?

15. In connection with the proof that $\sigma^{\tau}$ has the same cycle structure as $\sigma$
[see 4-5-8, part (i)], does any account have to be taken of the digits that
do not appear in the cycle expression for $\sigma$ (that is, those digits which
are fixed under $\sigma$)?

16. In connection with 4-5-11, show carefully that given $\sigma \in S_n$ it determines
an automorphism $\bar{\sigma}$ (denoted in the text by $\sigma$) of the ring $\mathbf{Z}[x_1, \dots, x_n]$.
Moreover,

$$\overline{\sigma^{-1}} = (\bar{\sigma})^{-1};$$

in words, the automorphism of $\mathbf{Z}[x_1, \dots, x_n]$ determined by $\sigma^{-1} \in S_n$
is the inverse of the automorphism of $\mathbf{Z}[x_1, \dots, x_n]$ determined by
$\sigma \in S_n$.

17. Find the subgroup of $S_4$ generated by the two elements:

(i) (12), (23),               (ii) (12), (123),
(iii) (12), (1234),           (iv) (12)(34), (123),
(v) (12), (234),              (vi) (12)(34), (13)(24),
(vii) (123), (1234),          (viii) (123), (124).

18. Find as many subgroups of $S_4$ as you can. Have you found them all?
Which ones are normal?

19. Show that the symmetric group $S_n$ is generated by the two elements

$$\sigma = (12), \qquad \tau = (12 \cdots n).$$

20. Prove that the alternating group $A_n$ is generated by each of the following
sets of 3-cycles:

(i) $\{(1ij) \mid i \ne j\}$, (ii) $\{(12j) \mid j = 3, \dots, n\}$.

21. The alternating group $A_4$ has 12 elements. Show that it has no subgroup
of order 6. Show further that Klein's 4-group (see 4-1-12, Problem 16)
which consists of the four elements: (1), (12)(34), (14)(23), (13)(24), is a
normal subgroup of $A_4$. Find all subgroups of $A_4$.

22. Find all subgroups of the dihedral groups (see 4-5-18):

(i) $D_5$; this is the group of symmetries of the regular pentagon, and
it has 10 elements,

(ii) $D_6$; this is the group of symmetries of the regular hexagon, and
it has 12 elements,

(Hi) Du, 

(iv) $D_n$; this is the group of symmetries of the regular $n$-gon, and it has
$2n$ elements.

Can you interpret the subgroups geometrically? Which ones are normal?

23. The dihedral group $D_4$ is the same as the octic group.

(i) Is it a normal subgroup of $S_4$?

(ii) Find the center $\mathfrak{Z}$ of $D_4$. What is $D_4/\mathfrak{Z}$?

(iii) Find the inner automorphisms of $D_4$.

(iv) Find the sets of conjugate elements in $D_4$.

(v) List all factor groups of $D_4$.

(vi) Describe all homomorphic images of $D_4$.

24. The regular tetrahedron, shown in the accompanying figure, has four
vertices and four faces, each of which is an equilateral triangle. (Of
course, the four faces are congruent.)
Show that the rigid motions of this solid figure form a group with
12 elements. Compare this group, called the tetrahedral group, with the
alternating group $A_4$. What about the symmetries (that is, rotations)
of the regular tetrahedron?


4-6. The Group $\mathbf{Z}_m^*$

In this section we investigate some number-theoretic questions whose
proper interpretation is in the language and framework of group theory.
More precisely, we shall, in essence, be studying $\mathbf{Z}_m^*$, the multiplicative
group of units in the ring $\mathbf{Z}_m$ of residue classes (mod $m$).

As the reader will observe, we turn in this section from our wordy discursive
style toward the brisk tight style ordinarily employed in mathematical
writing.

We begin with a standard definition from number theory.

4-6-1. Definition. Fix an integer $m > 1$ and consider any integer $a$ with
$(a, m) = 1$. We say that $a$ belongs to the exponent $t$ (mod $m$), or "$a$ belongs
to $t$" or "$a$ has exponent $t$", when $t$ is the smallest positive integer for which

$$a^t \equiv 1 \pmod{m}.$$

[Of course, such a $t$ exists and is, in fact, less than or equal to $\phi(m)$ since
according to Euler-Fermat, 4-3-10, we know that $a^{\phi(m)} \equiv 1 \pmod{m}$.] If $a$
belongs to the exponent $\phi(m)$ (mod $m$) (which is the maximum exponent
possible) we say that $a$ is a primitive root (mod $m$), or that "$m$ has the primitive
root $a$." It is customary to denote a primitive root, if one exists, by $g$; we
shall conform to this usage.

4-6-2. Examples. As illustrations of the definition, the reader may check
the following statements: 5 belongs to the exponent 2 (mod 12); 5 belongs
to the exponent 4 (mod 13); of course, 18 or any other integer congruent to
5 (mod 13) also belongs to the exponent 4 (mod 13), since $18^i \equiv 5^i \pmod{13}$
for all $i$; 3 belongs to the exponent 3 (mod 13), and so does any integer
congruent to 3 (mod 13); 14 does not belong to any exponent (mod 18), the
definition not being applicable, since $(14, 18) \ne 1$; 3 belongs to the exponent
6 (mod 7) so, because $\phi(7) = 6$, this says that 3 is a primitive root (mod 7).
For $m = 18$ we have $\phi(18) = 6$ and there exists a primitive root (mod 18);
for example, 5 is a primitive root. In particular, a primitive root can exist
when $m$ is not prime. On the other hand, a primitive root need not exist for
every $m$; in particular, for $m = 24$ we have $\phi(24) = 8$ and none of the eight
elements 1, 5, 7, 11, 13, 17, 19, 23 (these being the only integers which need
to be tested) belongs to the exponent 8 (mod 24); thus, there is no primitive
root (mod 24).
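The statements of 4-6-2 are all finite computations, so the reader may prefer to let a machine do the checking. The Python sketch below is a direct transcription of Definition 4-6-1; the function names `phi`, `exponent` and `primitive_roots` are our own, and `exponent` returns `None` when the definition does not apply.

```python
from math import gcd


def phi(m):
    """Euler's function: how many of 1..m are relatively prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def exponent(a, m):
    """Smallest t > 0 with a^t = 1 (mod m); None when (a, m) != 1."""
    if gcd(a, m) != 1:
        return None
    t, power = 1, a % m
    while power != 1:
        power = power * a % m
        t += 1
    return t


def primitive_roots(m):
    """All primitive roots (mod m) among 1..m-1, possibly none."""
    return [a for a in range(1, m) if exponent(a, m) == phi(m)]


print(exponent(5, 12), exponent(5, 13), exponent(18, 13), exponent(3, 13))   # 2 4 4 3
print(exponent(14, 18))                  # None
print(exponent(3, 7), phi(7))            # 6 6
print(phi(18), primitive_roots(18))      # 6 [5, 11]
print(phi(24), primitive_roots(24))      # 8 []
```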

4-6-3. Remark. In a number of places we have discussed and used the
correspondence between statements about congruence (mod $m$) and statements
about $\mathbf{Z}_m$. To translate the preceding definition to the language of
residue classes we make an initial observation: $(a, m) = 1$ means that $a$
(more precisely, $\lfloor a \rfloor_m$, but we shall be careless and drop the cumbersome
notation) is an element of the group $\mathbf{Z}_m^*$ (see 3-2-13 and 4-1-7). Then, clearly

    $a$ belongs to the exponent $t$ (mod $m$)
    $\Leftrightarrow$ $a$ has order $t$, as an element of the group $\mathbf{Z}_m^*$.

From what we know about the order of an element in a finite group, the
standard elementary properties of the "exponent" are now immediate.
Among these properties we have the following:

Suppose $a$ belongs to the exponent $t$ (mod $m$); then

(i) $a^n \equiv 1 \pmod{m} \Rightarrow t \mid n$. (Use 4-3-7.)

(ii) $t \mid \phi(m)$. (Use 4-3-9.)

(iii) $a^i \equiv a^j \pmod{m} \Leftrightarrow i \equiv j \pmod{t}$. (Use 4-3-7.)

(iv) $a^r$ belongs to the exponent $t/(r, t)$. (Use 4-3-13.)

In addition, the assertion that $a$ is a primitive root (mod $m$) is equivalent to
saying that $a$ (rather, $\lfloor a \rfloor_m$) is an element of order $\phi(m)$ in $\mathbf{Z}_m^*$. Since $\#(\mathbf{Z}_m^*) = \phi(m)$
this becomes: $a$ is a generator of the group $\mathbf{Z}_m^*$, which is cyclic. To
rephrase it:

    There exists a primitive root (mod $m$) $\Leftrightarrow$ the group $\mathbf{Z}_m^*$ is cyclic;
    and, in this situation, a primitive root (mod $m$) is the same as a
    generator of the cyclic group $\mathbf{Z}_m^*$.

In virtue of our results about cyclic groups, the following facts about primitive
roots are immediate:

(v) Suppose $g$ is a primitive root (mod $m$); then $g^r$ is a primitive root
$\Leftrightarrow$ $(r, \phi(m)) = 1$. (Use 4-3-13.)

(vi) If $m$ has a primitive root, then it has exactly $\phi(\phi(m))$ of them. (Use
4-3-13.)

(vii) If $(g, m) = 1$, then $g$ is a primitive root (mod $m$) $\Leftrightarrow$ the integers
$g, g^2, \dots, g^{\phi(m)}$ constitute a reduced residue system (mod $m$). (Use
3-2-22.)
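Properties (iv), (v) and (vi) can be spot-checked numerically for small moduli. The Python sketch below does this for $m = 7, 13, 18$; it is a check on examples and not a substitute for the group-theoretic arguments cited. The function names are ours, and `roots_from_generator` and `check_iv` are hypothetical helpers introduced only for this illustration.

```python
from math import gcd


def phi(m):
    """Euler's function: how many of 1..m are relatively prime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def exponent(a, m):
    """Smallest t > 0 with a^t = 1 (mod m); None when (a, m) != 1."""
    if gcd(a, m) != 1:
        return None
    t, power = 1, a % m
    while power != 1:
        power = power * a % m
        t += 1
    return t


def primitive_roots(m):
    return [a for a in range(1, m) if exponent(a, m) == phi(m)]


def roots_from_generator(g, m):
    """Property (v): the powers g^r with (r, phi(m)) = 1, reduced mod m."""
    n = phi(m)
    return sorted({pow(g, r, m) for r in range(1, n + 1) if gcd(r, n) == 1})


def check_iv(a, m, r):
    """Property (iv): a^r belongs to the exponent t/(r, t)."""
    t = exponent(a, m)
    return exponent(pow(a, r, m), m) == t // gcd(r, t)


for m in (7, 13, 18):
    roots = primitive_roots(m)
    print(m, roots, len(roots) == phi(phi(m)),
          roots_from_generator(roots[0], m) == roots)
# 7 [3, 5] True True
# 13 [2, 6, 7, 11] True True
# 18 [5, 11] True True
```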


Given the concepts on which we have focused, the answer to the following
question is obviously of interest: For which values of $m$ does there exist a
primitive root (mod $m$)? In other words, for which $m$ is $\mathbf{Z}_m^*$ cyclic? Some
information is already available. According to 4-3-16, $\mathbf{Z}_p^*$ is a cyclic group
[of order $\phi(p) = p - 1$] for every prime $p$. Also, as indicated in 4-6-2, $\mathbf{Z}_{18}^*$ is
cyclic (so $\mathbf{Z}_m^*$ can be cyclic without $m$ being prime) and $\mathbf{Z}_{24}^*$ is not cyclic
(so not every $\mathbf{Z}_m^*$ is cyclic).

To study $\mathbf{Z}_m^*$ for arbitrary $m$, let us write out the prime power factorization
of $m$, once and for all:

$$m = 2^{r_0} p_1^{r_1} p_2^{r_2} \cdots p_s^{r_s},$$

where $p_1, p_2, \dots, p_s$ are distinct odd primes, the exponents $r_1, r_2, \dots, r_s$ are
all greater than 0, and $r_0 \ge 0$ ($r_0 = 0$ occurs when $m$ is odd). As is often the
case in number theory, the prime 2 will play a special role, and for this reason
we have distinguished it from the other primes. Sometimes it will be convenient
to put $p_0 = 2$.

Before undertaking the detailed analysis of $\mathbf{Z}_m^*$, a preliminary (but
basically familiar) definition is needed. Suppose we are given multiplicative
groups $G_1, G_2, \ldots, G_k$ with identities $e_1, e_2, \ldots, e_k$, respectively. Consider
the product set $G_1 \times G_2 \times \cdots \times G_k$. It consists of all $k$-tuples $(a_1, a_2, \ldots, a_k)$
with $a_i \in G_i$ for $i = 1, 2, \ldots, k$. Define multiplication for elements of
$G_1 \times G_2 \times \cdots \times G_k$ componentwise; that is,

$$(a_1, a_2, \ldots, a_k) \cdot (b_1, b_2, \ldots, b_k) = (a_1 b_1, a_2 b_2, \ldots, a_k b_k).$$

Then $G_1 \times G_2 \times \cdots \times G_k$ becomes a group with identity $(e_1, e_2, \ldots, e_k)$.
This resulting group is denoted by $G_1 \times \cdots \times G_k$ (there is no harm in using
the same notation as for the product set) or by $\prod_{i=1}^{k} G_i$; it is known as the
(external) direct product of the groups $G_1, \ldots, G_k$. Other terminologies may
also be used; for example, if all the groups $G_i$ are additive, the resulting
group is often denoted by $\sum \oplus\, G_i$ and called the (external) direct sum.

It is easy to see that the direct product group $\prod_{i=1}^{k} G_i$ is abelian if and
only if each $G_i$ is abelian. We note also that $\prod_{i=1}^{k} G_i$ is finite if and only if
each $G_i$ is finite; and in this situation, the order of the direct product group
is the product of the orders of the component groups:

$$\#\Bigl(\prod_{i=1}^{k} G_i\Bigr) = \prod_{i=1}^{k} \#(G_i).$$

A final observation: The order of an element of the direct product is the
least common multiple of the orders of the component terms; in more detail,
for any element $(a_1, a_2, \ldots, a_k) \in G_1 \times G_2 \times \cdots \times G_k$, we have

$$\operatorname{ord}(a_1, a_2, \ldots, a_k) = [\operatorname{ord} a_1, \operatorname{ord} a_2, \ldots, \operatorname{ord} a_k].$$

[This is a consequence of: $(a_1, a_2, \ldots, a_k)^n = (e_1, \ldots, e_k)$, the identity of
$\prod_{i=1}^{k} G_i$, if and only if $a_i^n = e_i$ for each $i$.]
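The rule that the order of a tuple is the least common multiple of the component orders can be illustrated on a direct product of unit groups. This is only a sketch under assumptions: the function names `order`, `lcm`, and `tuple_order`, and the sample element, are chosen here for illustration.

```python
from math import gcd
from functools import reduce

def order(a, m):
    """Order of a in Z_m^*; assumes gcd(a, m) == 1 and m >= 2."""
    t, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        t += 1
    return t

def lcm(*ns):
    """Least common multiple [n_1, ..., n_k]."""
    return reduce(lambda x, y: x * y // gcd(x, y), ns, 1)

def tuple_order(elems, moduli):
    """Order of (a_1, ..., a_k) in Z_{m_1}^* x ... x Z_{m_k}^*, found by
    repeated componentwise multiplication until the identity appears."""
    start = tuple(a % m for a, m in zip(elems, moduli))
    identity = tuple(1 for _ in moduli)
    cur, n = start, 1
    while cur != identity:
        cur = tuple((c * a) % m for c, a, m in zip(cur, start, moduli))
        n += 1
    return n

elems, moduli = (3, 2, 5), (7, 5, 8)
component_orders = [order(a, m) for a, m in zip(elems, moduli)]
print(component_orders)                                   # [6, 4, 2]
print(tuple_order(elems, moduli), lcm(*component_orders)) # 12 12
```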


4-6-4. Proposition. We have an isomorphism of groups:

$$\mathbf{Z}_m^* \approx \mathbf{Z}_{2^{r_0}}^* \times \mathbf{Z}_{p_1^{r_1}}^* \times \cdots \times \mathbf{Z}_{p_s^{r_s}}^*.$$

Proof: Note that when $r_0 = 0$ it is understood that the first term on the
right, $\mathbf{Z}_{2^{r_0}}^*$, should be dropped completely. Also, when $r_0 = 1$ the group
$\mathbf{Z}_{2^{r_0}}^*$ consists of a single element, so this term really makes no contribution
to the right-hand side.

Essentially, this result is a straightforward generalization of things known
to us (for the case where the right-hand side has only two terms) from the
work in Section 3-2 on the computation of the Euler $\phi$-function. For this
reason, we shall merely outline the proof here.

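Following the procedure of 3-2-17, now let us form the direct sum ring
$\mathbf{Z}_{2^{r_0}} \oplus \mathbf{Z}_{p_1^{r_1}} \oplus \cdots \oplus \mathbf{Z}_{p_s^{r_s}}$ and define a mapping of

$$\mathbf{Z}_m \to \mathbf{Z}_{2^{r_0}} \oplus \mathbf{Z}_{p_1^{r_1}} \oplus \cdots \oplus \mathbf{Z}_{p_s^{r_s}}$$

by the rule

$$\lfloor a \rfloor_m \mapsto \bigl(\lfloor a \rfloor_{2^{r_0}}, \lfloor a \rfloor_{p_1^{r_1}}, \ldots, \lfloor a \rfloor_{p_s^{r_s}}\bigr).$$

This mapping is well defined and a ring homomorphism. Because $2^{r_0}, p_1^{r_1}, \ldots,
p_s^{r_s}$ are relatively prime in pairs, it follows from the Chinese remainder theorem
that the mapping is surjective. Using the fact that both rings under consideration
have $m$ elements, it follows that we have isomorphic rings

$$\mathbf{Z}_m \approx \mathbf{Z}_{2^{r_0}} \oplus \mathbf{Z}_{p_1^{r_1}} \oplus \cdots \oplus \mathbf{Z}_{p_s^{r_s}}.$$

Now, as noted in 3-2-18, if two commutative rings with unity are isomorphic
then their sets of units are in 1-1 correspondence. Even more, the groups of
units of the two rings are clearly isomorphic. Thus,

$$\mathbf{Z}_m^* \approx \bigl(\mathbf{Z}_{2^{r_0}} \oplus \mathbf{Z}_{p_1^{r_1}} \oplus \cdots \oplus \mathbf{Z}_{p_s^{r_s}}\bigr)^*.$$

Furthermore, as in 3-2-19, there is a natural 1-1 correspondence between the
units of the ring $\mathbf{Z}_{2^{r_0}} \oplus \mathbf{Z}_{p_1^{r_1}} \oplus \cdots \oplus \mathbf{Z}_{p_s^{r_s}}$ and the elements of the product set
$\mathbf{Z}_{2^{r_0}}^* \times \mathbf{Z}_{p_1^{r_1}}^* \times \cdots \times \mathbf{Z}_{p_s^{r_s}}^*$. Clearly, this provides an isomorphism of groups

$$\bigl(\mathbf{Z}_{2^{r_0}} \oplus \mathbf{Z}_{p_1^{r_1}} \oplus \cdots \oplus \mathbf{Z}_{p_s^{r_s}}\bigr)^* \approx \mathbf{Z}_{2^{r_0}}^* \times \mathbf{Z}_{p_1^{r_1}}^* \times \cdots \times \mathbf{Z}_{p_s^{r_s}}^*.$$

The desired result is thereby proved. ∎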
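The isomorphism of 4-6-4 can be checked by machine for a specific modulus. The sketch below is not taken from the text; `units`, `crt_map`, and `is_unit_isomorphism` are hypothetical helper names, and the check simply enumerates all units, so it is only practical for small $m$.

```python
from math import gcd
from itertools import product

def units(n):
    """Representatives of Z_n^* (for n = 1 this is the single class 0)."""
    return [a for a in range(n) if gcd(a, n) == 1]

def crt_map(a, moduli):
    """The residue class of a, sent to its tuple of residues."""
    return tuple(a % q for q in moduli)

def is_unit_isomorphism(m, moduli):
    """Check that a -> (a mod q_1, ..., a mod q_k) is a multiplicative
    bijection from Z_m^* onto Z_{q_1}^* x ... x Z_{q_k}^*."""
    source = units(m)
    target = set(product(*(units(q) for q in moduli)))
    images = {crt_map(a, moduli) for a in source}
    bijective = images == target and len(images) == len(source)
    multiplicative = all(
        crt_map(a * b % m, moduli)
        == tuple(x * y % q for x, y, q in
                 zip(crt_map(a, moduli), crt_map(b, moduli), moduli))
        for a in source for b in source
    )
    return bijective and multiplicative

# 360 = 2^3 * 3^2 * 5, so the prime power moduli are 8, 9, 5.
print(is_unit_isomorphism(360, [8, 9, 5]))
# With moduli that are not relatively prime in pairs the map is not injective.
print(is_unit_isomorphism(12, [2, 6]))
```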

Our basic question of when $\mathbf{Z}_m^*$ is cyclic has been transferred to the group
$\mathbf{Z}_{2^{r_0}}^* \times \mathbf{Z}_{p_1^{r_1}}^* \times \cdots \times \mathbf{Z}_{p_s^{r_s}}^*$; the analysis of this group is subsumed under the
following general result.

4-6-5. Proposition. Suppose $G_1, G_2, \ldots, G_k$ are (multiplicative) groups
of order $n_1, n_2, \ldots, n_k$, respectively.

(i) If the direct product group $G_1 \times G_2 \times \cdots \times G_k$ is cyclic, then so is
$G_i$ for each $i = 1, 2, \ldots, k$.

(ii) If $G_1, G_2, \ldots, G_k$ are cyclic, then $G_1 \times G_2 \times \cdots \times G_k$ is cyclic $\iff$
$n_1, n_2, \ldots, n_k$ are relatively prime in pairs.

Proof: (i) For each $i = 1, \ldots, k$ define a mapping, called the $i$th projection,

$$\pi_i : G_1 \times G_2 \times \cdots \times G_k \to G_i$$

by putting

$$\pi_i(a_1, a_2, \ldots, a_i, \ldots, a_k) = a_i.$$

This is a surjective homomorphism. Since any homomorphic image of a
cyclic group is cyclic (4-4-15, Problem 14) we see that $G_i$ is cyclic.

(ii) First of all we note that $n_1, n_2, \ldots, n_k$ are relatively prime in pairs
$\iff [n_1, n_2, \ldots, n_k] = n_1 n_2 \cdots n_k$. (This fact has already been stated in 1-5-13,
Problem 20.) Here is one way to prove it.

The positive integers $n_1, \ldots, n_k$ are relatively prime in pairs

$\iff$ for each prime $p$ and every pair $i, j$ from $\{1, 2, \ldots, k\}$ with $i \ne j$ we have $\min\{\nu_p(n_i), \nu_p(n_j)\} = 0$

$\iff$ for each prime $p$, the set $\{\nu_p(n_1), \ldots, \nu_p(n_k)\}$ contains at most one nonzero term

$\iff \sum_{i=1}^{k} \nu_p(n_i) = \max\{\nu_p(n_i) \mid i = 1, \ldots, k\}$ for each prime $p$

$\iff n_1 n_2 \cdots n_k = [n_1, n_2, \ldots, n_k]$.
Turning to the proof itself, suppose $n_1, n_2, \ldots, n_k$ are relatively prime
in pairs. For each cyclic group $G_i$ choose a generator $a_i$, so $\operatorname{ord} a_i = n_i$. In
the group $G_1 \times G_2 \times \cdots \times G_k$, whose order is $n_1 n_2 \cdots n_k$, the order of the
element $(a_1, a_2, \ldots, a_k)$ is given by

$$\operatorname{ord}(a_1, a_2, \ldots, a_k) = [\operatorname{ord} a_1, \operatorname{ord} a_2, \ldots, \operatorname{ord} a_k] = [n_1, n_2, \ldots, n_k] = n_1 n_2 \cdots n_k.$$

Hence, the group $G_1 \times \cdots \times G_k$ is cyclic and $(a_1, \ldots, a_k)$ is a generator.
This proves the implication $\Leftarrow$.

Conversely, suppose $n_1, n_2, \ldots, n_k$ are not relatively prime in pairs, so
$[n_1, n_2, \ldots, n_k] \ne n_1 n_2 \cdots n_k$. An arbitrary element of $G_1 \times \cdots \times G_k$ is of
form $(b_1, \ldots, b_k)$ with $b_i \in G_i$ for $i = 1, \ldots, k$. Denoting the order of $b_i$ by
$q_i$, we know that $q_i \mid n_i$. Therefore,

$$\operatorname{ord}(b_1, \ldots, b_k) = [\operatorname{ord} b_1, \ldots, \operatorname{ord} b_k] = [q_1, q_2, \ldots, q_k],$$

which divides $[n_1, n_2, \ldots, n_k]$, and $[n_1, n_2, \ldots, n_k] < n_1 n_2 \cdots n_k$.

Thus, the group $G_1 \times \cdots \times G_k$ of order $n_1 n_2 \cdots n_k$ contains no element of
order $n_1 n_2 \cdots n_k$. Hence, $G_1 \times \cdots \times G_k$ is not cyclic; and we have proved
the implication $\Rightarrow$. The entire proof is now complete. ∎
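Part (ii) can be tested on small cases. As a sketch, and as an assumption made only for this illustration, each cyclic group of order $n$ is modeled by the additive group $\mathbf{Z}_n$, where the element $a$ has order $n/(a, n)$; the names `additive_order`, `max_order`, and `product_is_cyclic` are introduced here.

```python
from math import gcd
from functools import reduce
from itertools import product

def lcm(*ns):
    """Least common multiple [n_1, ..., n_k]."""
    return reduce(lambda x, y: x * y // gcd(x, y), ns, 1)

def additive_order(a, n):
    """Order of a in the additive cyclic group Z_n."""
    return n // gcd(a, n)

def max_order(ns):
    """Largest order of an element of Z_{n_1} x ... x Z_{n_k}."""
    return max(
        lcm(*(additive_order(a, n) for a, n in zip(elem, ns)))
        for elem in product(*(range(n) for n in ns))
    )

def product_is_cyclic(ns):
    """A finite group is cyclic exactly when some element has order equal
    to the order of the group."""
    size = reduce(lambda x, y: x * y, ns, 1)
    return max_order(ns) == size

print(product_is_cyclic([4, 9, 5]))   # orders relatively prime in pairs
print(product_is_cyclic([4, 6]))      # (4, 6) = 2
print(max_order([4, 6]), lcm(4, 6))   # the maximum order is the l.c.m.
```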

Applying this result to the concrete situation at hand, we see that if
$\mathbf{Z}_m^*$ is cyclic, then each of the groups $\mathbf{Z}_{2^{r_0}}^*, \mathbf{Z}_{p_1^{r_1}}^*, \ldots, \mathbf{Z}_{p_s^{r_s}}^*$ must be cyclic
and their orders $\phi(2^{r_0}), \phi(p_1^{r_1}), \ldots, \phi(p_s^{r_s})$ must be relatively prime in pairs.
Because $\phi(p_i^{r_i}) = p_i^{r_i - 1}(p_i - 1)$ is even and $\phi(2^{r_0}) = 2^{r_0 - 1}$ is even when
$r_0 \ge 2$, it follows that if $\mathbf{Z}_m^*$ is cyclic, the only possibilities for $m$ are: $2^{r_0}$,
$p_1^{r_1}$, $2^{r_0} p_1^{r_1}$ with $r_0 < 2$.

Clearly, the problem still facing us is to decide when $\mathbf{Z}_{p^r}^*$ ($p$ any prime) is
cyclic. We already know that $\mathbf{Z}_p^*$ is cyclic for every prime $p$. A natural
approach is then to suppose inductively that $\mathbf{Z}_{p^r}^*$ is cyclic and investigate if
$\mathbf{Z}_{p^{r+1}}^*$ is also cyclic. One step in this direction is the following.

4-6-6. Lemma. Suppose $p$ is an odd prime and $g$ is a primitive root
(mod $p^r$), $r \ge 1$ (that is: $g$ belongs to the exponent $\phi(p^r)$ (mod $p^r$), or
equivalently, $\mathbf{Z}_{p^r}^*$ is cyclic and $g$ is a generator); then

(i) The exponent to which $g$ belongs (mod $p^{r+1}$) is either $\phi(p^{r+1})$ or $\phi(p^r)$.

(ii) $g$ belongs to the exponent $\phi(p^r)$ (mod $p^{r+1}$) $\iff g^{\phi(p^r)} \equiv 1 \pmod{p^{r+1}}$.

(iii) $g$ is a primitive root (mod $p^{r+1}$) $\iff g^{\phi(p^r)} \not\equiv 1 \pmod{p^{r+1}}$.





Proof: Let $t$ be the exponent to which $g$ belongs (mod $p^{r+1}$); so

$$g^t \equiv 1 \pmod{p^{r+1}} \qquad (*)$$

and because $g^{\phi(p^{r+1})} \equiv 1 \pmod{p^{r+1}}$ we know that $t \mid \phi(p^{r+1})$. On the other
hand (*) implies $g^t \equiv 1 \pmod{p^r}$; and because $g$ belongs to $\phi(p^r)$ (mod $p^r$),
we know that $\phi(p^r) \mid t$. The relation

$$\phi(p^r) \mid t \mid \phi(p^{r+1}),$$

coupled with $\phi(p^{r+1}) = p\,\phi(p^r)$, forces the conclusion: $t = \phi(p^r)$ or $t = \phi(p^{r+1})$.

Parts (ii) and (iii) are now immediate; in fact, they are essentially
restatements of what has already been proved. ∎

Next, we examine the passage from $\mathbf{Z}_p^*$ to $\mathbf{Z}_{p^2}^*$. Things go very nicely.


4-6-7. Lemma. The group $\mathbf{Z}_{p^2}^*$ is cyclic. In fact, there exists a primitive root
$g$ (mod $p$) which is also a primitive root (mod $p^2$).


Proof: Since $\mathbf{Z}_p^*$ is cyclic, there exists a primitive root (mod $p$). Choose
any one, and call it $g$. We have $g^{p-1} \equiv 1 \pmod{p}$. If $g$ belongs to the exponent
$\phi(p^2) = p(p - 1)$ (mod $p^2$) we are finished. If not, then according to 4-6-6,
$g$ belongs to the exponent $\phi(p) = p - 1$, and we have

$$g^{p-1} \equiv 1 \pmod{p^2}. \qquad (\#)$$

Now consider $g + p$. It represents the same element of $\mathbf{Z}_p^*$ as $g$, so it is a
primitive root (mod $p$). To show that $g + p$ is also a primitive root (mod $p^2$)
it suffices [according to 4-6-6, part (iii)] to show

$$(g + p)^{p-1} \not\equiv 1 \pmod{p^2}.$$

But this is easy:

$$\begin{aligned}
(g + p)^{p-1} &= g^{p-1} + (p - 1)g^{p-2}p + p^2(\text{some integer}) \\
&\equiv g^{p-1} + (p - 1)g^{p-2}p \pmod{p^2} \\
&\equiv 1 - g^{p-2}p \pmod{p^2}.
\end{aligned}$$

Because $(g, p) = 1$, surely $p \nmid g^{p-2}$, so $(g + p)^{p-1} \not\equiv 1 \pmod{p^2}$. Thus, $g + p$
is a primitive root (mod $p^2$) and the proof is complete. ∎

The next step is to examine the passage from $\mathbf{Z}_{p^r}^*$ to $\mathbf{Z}_{p^{r+1}}^*$, in general.
This goes in one fell swoop, and leads to:

4-6-8. Proposition. For the odd prime $p$, $\mathbf{Z}_{p^r}^*$ is cyclic for all $r \ge 1$. In
fact, if $g$ is any integer which is a primitive root (mod $p$) and (mod $p^2$),
then $g$ is a primitive root (mod $p^r$) for all positive $r$.

Proof: From the preceding, $\mathbf{Z}_p^*$ and $\mathbf{Z}_{p^2}^*$ are cyclic and there exists an
integer which is a primitive root (mod $p$) and (mod $p^2$). Take any such
integer; call it $g$. We have then (in virtue of 4-4-6),

$$g^{p-1} \equiv 1 \pmod{p} \quad\text{and}\quad g^{p-1} \not\equiv 1 \pmod{p^2}.$$

Therefore, we can write

$$g^{p-1} = 1 + tp \quad\text{where } p \nmid t.$$

Now, suppose inductively that $g$ is a primitive root (mod $p^r$). Then
$g^{\phi(p^r)} = g^{(p-1)p^{r-1}} = (1 + tp)^{p^{r-1}}$, so by the binomial theorem we have

$$\begin{aligned}
g^{\phi(p^r)} &= 1 + p^{r-1}(tp) + \frac{p^{r-1}(p^{r-1} - 1)}{2}(tp)^2 + \cdots \\
&= 1 + tp^r + p^{r+1}(\text{some integer}) \\
&\equiv 1 + tp^r \pmod{p^{r+1}}.
\end{aligned}$$

Since $p \nmid t$, we have $g^{\phi(p^r)} \not\equiv 1 \pmod{p^{r+1}}$, so according to 4-6-6, $g$ is a
primitive root (mod $p^{r+1}$). ∎
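The passage from $p$ to $p^2$ in 4-6-7 and the lifting to all $p^r$ in 4-6-8 can be illustrated numerically. The following is a sketch under assumptions: the names `is_primitive_root` and `lift_to_p_squared` are hypothetical, and the example $g = 7$, $p = 5$ was picked here because $7^4 = 2401 \equiv 1 \pmod{25}$, so 7 is a primitive root (mod 5) that fails (mod 25).

```python
from math import gcd

def order(a, m):
    """Exponent to which a belongs (mod m); assumes gcd(a, m) == 1 and m >= 2."""
    t, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        t += 1
    return t

def is_primitive_root(g, p, r):
    """True when g is a primitive root (mod p^r), p an odd prime."""
    return gcd(g, p) == 1 and order(g, p ** r) == p ** (r - 1) * (p - 1)

def lift_to_p_squared(g, p):
    """Given a primitive root g (mod p), return g or g + p as in the proof
    of 4-6-7, whichever is a primitive root (mod p^2)."""
    return g if pow(g, p - 1, p * p) != 1 else g + p

g, p = 7, 5
print(is_primitive_root(g, p, 1), is_primitive_root(g, p, 2))
h = lift_to_p_squared(g, p)            # switches to g + p = 12
print(h, all(is_primitive_root(h, p, r) for r in range(1, 6)))
```

When the test $g^{p-1} \not\equiv 1 \pmod{p^2}$ already succeeds, as for $g = 2$, $p = 5$, the function returns $g$ unchanged.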

It still remains to study the cyclicity of $\mathbf{Z}_{2^r}^*$. The story here differs from
that for $\mathbf{Z}_{p^r}^*$.

4-6-9. Proposition. (i) The groups $\mathbf{Z}_2^*$ and $\mathbf{Z}_{2^2}^*$ are cyclic; $\mathbf{Z}_{2^3}^*$ is not
cyclic.

(ii) The group $\mathbf{Z}_{2^r}^*$ is not cyclic for any $r \ge 3$.

(iii) Every element of $\mathbf{Z}_{2^r}^*$ ($r \ge 3$) has order less than or equal to $2^{r-2}$; in
fact, the order of every element divides $2^{r-2}$.

(iv) The element 5 of $\mathbf{Z}_{2^r}^*$ ($r \ge 3$) has order $2^{r-2}$.

Proof: The group $\mathbf{Z}_2^*$ consists of a single element, while $\mathbf{Z}_{2^2}^* = \mathbf{Z}_4^*$ has
two elements, so they are both cyclic. The group $\mathbf{Z}_{2^3}^* = \mathbf{Z}_8^*$ consists of the
four residue classes 1, 3, 5, 7 and

$$1^2 = 3^2 = 5^2 = 7^2 = 1 \quad\text{in } \mathbf{Z}_8^*,$$

so $\mathbf{Z}_{2^3}^*$ is not cyclic. This proves (i).

The group $\mathbf{Z}_{2^r}^*$ has order $\phi(2^r) = 2^{r-1}$, so in order to prove (ii) it surely
suffices to prove (iii); and for this it suffices to show that if $a$ is any odd
integer, then

$$a^{2^{r-2}} \equiv 1 \pmod{2^r}, \quad r \ge 3. \qquad (\#)$$





The case $r = 3$ asserts $a^2 \equiv 1 \pmod{8}$; it follows from writing $a = 1 + 2t$, so
that $a^2 = 1 + 4t(t + 1)$, and noting that $t(t + 1)$ is always even. (Of course,
$a^2 \equiv 1 \pmod{8}$ follows from the observations about $\mathbf{Z}_8^*$ made in the proof
of part (i).)

Now, suppose inductively that (#) is true for $r$, so we may write $a^{2^{r-2}} =
1 + 2^r t$. Then

$$a^{2^{r-1}} = \bigl(a^{2^{r-2}}\bigr)^2 = 1 + 2^{r+1}t + 2^{2r}t^2 \equiv 1 \pmod{2^{r+1}}.$$

Thus, (#) holds for $r + 1$ and (iii) is proved.

As for (iv), since the order of 5 in $\mathbf{Z}_{2^r}^*$ is a divisor of $2^{r-2}$, it suffices to show

$$5^{2^{r-3}} \not\equiv 1 \pmod{2^r}, \quad r \ge 3.$$

Actually, we are somewhat more precise, and show that

$$5^{2^{r-3}} \equiv 1 + 2^{r-1} \pmod{2^r}, \quad r \ge 3. \qquad (\#\#)$$

This is valid for $r = 3$ since

$$5 \equiv 1 + 4 \pmod{2^3}.$$

Now, suppose inductively that (##) holds for $r$. Then

$$\begin{aligned}
5^{2^{r-2}} = \bigl(5^{2^{r-3}}\bigr)^2 &= (1 + 2^{r-1} + 2^r t)^2 \qquad (t \in \mathbf{Z}) \\
&= 1 + 2^r + 2^{2r-2} + 2^{r+1}t + 2^{2r}t + 2^{2r}t^2 \\
&\equiv 1 + 2^r \pmod{2^{r+1}}.
\end{aligned}$$

Consequently, (##) holds for $r + 1$, which completes the proof. ∎
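Parts (iii) and (iv) of 4-6-9, together with the sharper congruence (##), lend themselves to a direct numerical check. This is only a sketch; the function name `check_two_power` and the range of $r$ tried are choices made here.

```python
def order(a, m):
    """Exponent to which a belongs (mod m); assumes a is a unit and m >= 2."""
    t, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        t += 1
    return t

def check_two_power(r):
    """Check (iii), (iv) and (##) of 4-6-9 for the modulus 2^r, r >= 3."""
    m = 2 ** r
    bound = 2 ** (r - 2)
    # (iii): the order of every odd residue divides 2^(r-2).
    all_divide = all(bound % order(a, m) == 0 for a in range(1, m, 2))
    # (iv): the element 5 has order exactly 2^(r-2).
    five_exact = order(5, m) == bound
    # (##): 5^(2^(r-3)) is congruent to 1 + 2^(r-1) (mod 2^r).
    refined = pow(5, 2 ** (r - 3), m) == 1 + 2 ** (r - 1)
    return all_divide, five_exact, refined

for r in range(3, 9):
    print(r, check_two_power(r))
```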


4-6-10. Theorem. The group $\mathbf{Z}_m^*$ is cyclic (that is, there exists a primitive
root for $m$) $\iff m$ is of form $2, 4, p^r, 2p^r$, where $p$ is an odd prime.


Proof: Combine the things we already know, especially 4-6-4, 4-6-5,
4-6-8, and 4-6-9. ∎
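The statement of 4-6-10 can be compared against a brute-force search over small moduli. The sketch below is an illustration under assumptions; `has_primitive_root` and `of_stated_form` are names introduced here, and the search bound is arbitrary.

```python
from math import gcd

def phi(n):
    """Euler's phi-function by direct count (adequate for small n)."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def order(a, m):
    """Exponent to which a belongs (mod m); assumes gcd(a, m) == 1 and m >= 2."""
    t, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        t += 1
    return t

def has_primitive_root(m):
    """True when some unit (mod m) has order phi(m)."""
    n = phi(m)
    return any(gcd(a, m) == 1 and order(a, m) == n for a in range(1, m))

def of_stated_form(m):
    """True when m is 2, 4, p^r or 2p^r with p an odd prime (m >= 2)."""
    if m in (2, 4):
        return True
    if m % 2 == 0:
        m //= 2                      # strip a single factor 2
    if m % 2 == 0 or m == 1:
        return False                 # a second factor 2 remains
    p = next(d for d in range(3, m + 1) if m % d == 0)  # least prime factor
    while m % p == 0:
        m //= p
    return m == 1                    # nothing but powers of p was left

mismatches = [m for m in range(2, 120) if has_primitive_root(m) != of_stated_form(m)]
print(mismatches)
```

An empty list of mismatches is what the theorem predicts for the range searched.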

As an additional consequence of the foregoing discussion we have the 
following result. 


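4-6-11. Theorem. For any $m = 2^{r_0} p_1^{r_1} p_2^{r_2} \cdots p_s^{r_s}$, let $\lambda(m)$ denote the
maximum of the orders of all the elements in the group $\mathbf{Z}_m^*$. Then $\lambda(m)$,
which is called the universal exponent of $m$, is given by

$$\lambda(m) = [\lambda(2^{r_0}), \phi(p_1^{r_1}), \ldots, \phi(p_s^{r_s})].$$

Proof: For any $m$, the order of any element of $\mathbf{Z}_m^*$ divides $\phi(m)$, so
$\lambda(m) \mid \phi(m)$, and $\lambda(m) = \phi(m) \iff \mathbf{Z}_m^*$ is cyclic.

Note that $\lambda(2^0) = \lambda(1) = 1$, by convention; $\lambda(2^1) = \lambda(2) = \phi(2) = 1$,
since $\mathbf{Z}_2^*$ is cyclic; $\lambda(2^2) = \lambda(4) = \phi(4) = 2$, since $\mathbf{Z}_4^*$ is cyclic; and for
$r_0 \ge 3$, $\lambda(2^{r_0}) = 2^{r_0 - 2}$, by 4-6-9.

To compute $\lambda(m)$, in general, consider the isomorphic groups

$$\mathbf{Z}_m^* \approx \mathbf{Z}_{2^{r_0}}^* \times \mathbf{Z}_{p_1^{r_1}}^* \times \cdots \times \mathbf{Z}_{p_s^{r_s}}^*.$$

Suppose $r_0 \ge 3$. Then, by 4-6-9, the element 5 has order $\lambda(2^{r_0}) = 2^{r_0 - 2}$ in
$\mathbf{Z}_{2^{r_0}}^*$. For $i = 1, \ldots, s$ choose a generator $a_i$ of the cyclic group $\mathbf{Z}_{p_i^{r_i}}^*$ [whose
order is $\phi(p_i^{r_i})$]. Then the element $(5, a_1, a_2, \ldots, a_s)$ of $\mathbf{Z}_{2^{r_0}}^* \times \mathbf{Z}_{p_1^{r_1}}^* \times \cdots
\times \mathbf{Z}_{p_s^{r_s}}^*$ has order $[\lambda(2^{r_0}), \phi(p_1^{r_1}), \ldots, \phi(p_s^{r_s})]$. Hence,

$$\lambda(m) \ge [\lambda(2^{r_0}), \phi(p_1^{r_1}), \ldots, \phi(p_s^{r_s})].$$

On the other hand, an arbitrary element of $\mathbf{Z}_{2^{r_0}}^* \times \mathbf{Z}_{p_1^{r_1}}^* \times \cdots \times \mathbf{Z}_{p_s^{r_s}}^*$ is
of form $(b_0, \ldots, b_s)$ where $b_0 \in \mathbf{Z}_{2^{r_0}}^*$ (so its order, $q_0$, divides $2^{r_0 - 2}$, by
4-6-9), and $b_i \in \mathbf{Z}_{p_i^{r_i}}^*$ for $i = 1, \ldots, s$ [so its order, $q_i$, divides $\phi(p_i^{r_i})$]. The order
of $(b_0, b_1, \ldots, b_s)$ is $[q_0, q_1, \ldots, q_s]$, and surely

$$[q_0, q_1, \ldots, q_s] \mid [2^{r_0 - 2}, \phi(p_1^{r_1}), \ldots, \phi(p_s^{r_s})].$$

This implies $\lambda(m) \le [\lambda(2^{r_0}), \phi(p_1^{r_1}), \ldots, \phi(p_s^{r_s})]$, so equality holds.
The situation where $r_0 < 3$ is easy and is left to the reader. ∎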


4-6-12. Remark. The preceding result may be viewed as a generalization
of Euler-Fermat. In more detail, if $a$ is any integer relatively prime to $m$,
then, according to Euler-Fermat,

$$a^{\phi(m)} \equiv 1 \pmod{m} \qquad (*)$$

whereas 4-6-11 guarantees that

$$a^{\lambda(m)} \equiv 1 \pmod{m}. \qquad (**)$$

[In fact, $\lambda(m)$ is best possible, in the sense that no smaller integer will have
the same property for all $a$ with $(a, m) = 1$.]

Now, on occasion, $\lambda(m)$ can be much smaller than $\phi(m)$, so that (**) is a
much stronger statement than (*). For example, suppose $m = 5040 =
2^4 \cdot 3^2 \cdot 5 \cdot 7$; then

$$\phi(m) = \phi(2^4)\phi(3^2)\phi(5)\phi(7) = (2^3)(3 \cdot 2)(4)(6) = 1152$$

while

$$\lambda(m) = [\lambda(2^4), \phi(3^2), \phi(5), \phi(7)] = [2^2, 3 \cdot 2, 4, 6] = 12.$$

Therefore,

$$a^{12} \equiv 1 \pmod{5040} \quad\text{whenever } (a, 5040) = 1,$$

which is far more informative than $a^{1152} \equiv 1 \pmod{5040}$, the assertion of
Euler-Fermat.
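The formula of 4-6-11 and the example $m = 5040$ can be reproduced in a few lines. This is a sketch rather than a definitive implementation; the names `factorize`, `lam_two_power`, and `universal_exponent` are assumptions, and factorization is done by plain trial division.

```python
from math import gcd
from functools import reduce

def factorize(m):
    """Prime factorization of m >= 1 as {prime: exponent}, by trial division."""
    factors, p = {}, 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:
        factors[m] = factors.get(m, 0) + 1
    return factors

def phi(m):
    """Euler's phi-function from the prime factorization."""
    return reduce(lambda acc, pr: acc * pr[0] ** (pr[1] - 1) * (pr[0] - 1),
                  factorize(m).items(), 1)

def lam_two_power(r):
    """lambda(2^r): 1, 1, 2 for r = 0, 1, 2 and 2^(r-2) for r >= 3."""
    return 1 if r <= 1 else 2 if r == 2 else 2 ** (r - 2)

def universal_exponent(m):
    """lambda(m) as the l.c.m. of lambda(2^r0) and the phi(p_i^r_i)."""
    parts = [lam_two_power(r) if p == 2 else p ** (r - 1) * (p - 1)
             for p, r in factorize(m).items()]
    return reduce(lambda x, y: x * y // gcd(x, y), parts, 1)

m = 5040
units = [a for a in range(1, m) if gcd(a, m) == 1]
print(phi(m), universal_exponent(m))               # 1152 12
print(all(pow(a, 12, m) == 1 for a in units))      # True
```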

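4-6-13. Discussion. We turn our attention in a new direction. Suppose
$m$ is an integer for which the group $\mathbf{Z}_m^*$ [whose order is $\phi(m)$] is cyclic. Fix
a generator $g$ of $\mathbf{Z}_m^*$. By the properties of cyclic groups (see Section 4-3) we
know that for integers $i, j$

$$g^i = g^j \iff i \equiv j \pmod{\phi(m)}$$

and, what is of major importance for us here, given any element $a \in \mathbf{Z}_m^*$ it
can be expressed uniquely in the form

$$a = g^r \quad\text{where } 0 \le r \le \phi(m) - 1.$$

We shall write

$$r = \operatorname{ind}_g a$$

and say that $r$ is the index of $a$ with respect to $g$. (The index depends on the
choice of the generator $g$ of $\mathbf{Z}_m^*$; when it is clear which $g$ is being used, one
often writes $\operatorname{ind} a$ instead of $\operatorname{ind}_g a$.) Consequently, $a \in \mathbf{Z}_m^*$ may be written as

$$a = g^{\operatorname{ind}_g a}.$$

Taking another element $b \in \mathbf{Z}_m^*$, we have $b = g^{\operatorname{ind}_g b}$; and then

$$ab = g^{\operatorname{ind}_g a + \operatorname{ind}_g b}.$$

Of course, $\operatorname{ind}_g a + \operatorname{ind}_g b$ may well be greater than $\phi(m) - 1$, but because

$$ab = g^{\operatorname{ind}_g(ab)},$$

it follows that

$$\operatorname{ind}_g(ab) \equiv \operatorname{ind}_g a + \operatorname{ind}_g b \pmod{\phi(m)}, \quad a, b \in \mathbf{Z}_m^*. \qquad (*)$$

From this we also obtain

$$\operatorname{ind}_g(a^n) \equiv n \operatorname{ind}_g a \pmod{\phi(m)}, \quad a \in \mathbf{Z}_m^*,\ n \ge 1.$$

To appreciate the full significance of (*), consider the mapping $a \to \operatorname{ind}_g a$;
more accurately, the mapping

$$\chi : a \mapsto \lfloor \operatorname{ind}_g a \rfloor_{\phi(m)}$$

of

$$\{\mathbf{Z}_m^*, \cdot\} \to \{\mathbf{Z}_{\phi(m)}, +\}.$$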
Translating (*) to residue classes (mod $\phi(m)$) we have

$$\lfloor \operatorname{ind}_g(ab) \rfloor_{\phi(m)} = \lfloor \operatorname{ind}_g a \rfloor_{\phi(m)} + \lfloor \operatorname{ind}_g b \rfloor_{\phi(m)}.$$

Hence, $\chi$ is a homomorphism of groups [$\chi(ab) = \chi(a) + \chi(b)$]. In fact, because
both groups are cyclic with $\phi(m)$ elements it is easy to see that $\chi$ is an
isomorphism. The generator $g$ of $\mathbf{Z}_m^*$ is mapped under $\chi$ to the generator
$1 = \lfloor 1 \rfloor_{\phi(m)}$ of $\{\mathbf{Z}_{\phi(m)}, +\}$ (since $\operatorname{ind}_g g = 1$).

The isomorphism $\chi$ determined by $\operatorname{ind}_g$ depends on the choice of generator
$g$ of $\mathbf{Z}_m^*$; moreover, it is not hard to see that taking all possible choices for
the generator $g$ of $\mathbf{Z}_m^*$ determines all isomorphisms from $\mathbf{Z}_m^*$ to $\mathbf{Z}_{\phi(m)}$.

Because $\operatorname{ind}_g$ (or rather $\chi$) transforms multiplication into addition, it
should appropriately be viewed as an analog of the “logarithm” function.

4-6-14. Remark. In 4-6-13 we discussed the index as a function defined
on elements of $\mathbf{Z}_m^*$. (Actually, everything could have been done with $\mathbf{Z}_m^*$
replaced by an arbitrary finite cyclic group.) Of course, in the standard works
on number theory one does not have the algebraic object $\mathbf{Z}_m^*$, viewed as a
group. Our purpose here is to translate the story of the index, as given in
4-6-13, into the language of congruences which is common in number theory.
This requires only minor variations.

Suppose $m$ is an integer for which there exists a primitive root. Fix a
primitive root $g$. For integers $i, j$ we have

$$g^i \equiv g^j \pmod{m} \iff i \equiv j \pmod{\phi(m)}.$$

Given an integer $a$ with $(a, m) = 1$, there exists a unique integer $r$ with

$$g^r \equiv a \pmod{m}, \quad 0 \le r \le \phi(m) - 1.$$

In other words, $r$ is the smallest nonnegative integer for which $g^r \equiv a \pmod{m}$.
We write

$$r = \operatorname{ind}_g a$$

and call it the “index of $a$ with respect to $g$” (for the modulus $m$). The index
satisfies the following properties, for $a, a', b \in \mathbf{Z}$ all relatively prime to $m$:

$$\begin{aligned}
\operatorname{ind}_g a = \operatorname{ind}_g a' &\iff a \equiv a' \pmod{m}, \\
\operatorname{ind}_g(ab) &\equiv \operatorname{ind}_g a + \operatorname{ind}_g b \pmod{\phi(m)}, \\
\operatorname{ind}_g(a^n) &\equiv n(\operatorname{ind}_g a) \pmod{\phi(m)}, \quad n \ge 1.
\end{aligned}$$
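The three properties of the index can be verified mechanically for a small modulus. The sketch below assumes $m = 17$ and $g = 3$ purely for illustration, and `index_table` is a hypothetical helper that uses the normalization $0 \le \operatorname{ind}_g a \le \phi(m) - 1$ of this remark (so the index of 1 is 0).

```python
def index_table(g, m):
    """Map each unit a (mod m) to ind_g a with 0 <= ind_g a <= phi(m) - 1.
    Assumes g is a primitive root (mod m)."""
    table, x, r = {}, 1, 0
    while x not in table:
        table[x] = r
        x = (x * g) % m
        r += 1
    return table

m, g = 17, 3
ind = index_table(g, m)
n_units = len(ind)                      # phi(17) = 16

# ind(ab) is congruent to ind a + ind b (mod phi(m)).
product_rule = all(ind[a * b % m] == (ind[a] + ind[b]) % n_units
                   for a in ind for b in ind)
# ind(a^n) is congruent to n * ind a (mod phi(m)).
power_rule = all(ind[pow(a, n, m)] == (n * ind[a]) % n_units
                 for a in ind for n in range(1, 20))
print(n_units, product_rule, power_rule)
```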

4-6-15. Example. We do some computations with the “index.” The
kinds of things which can be done arise from the fact that “ind” behaves
like “log.” Of course, the index exists only when $m = 2, 4, p^r, 2p^r$, $p$ an odd
prime (as per 4-6-10). We shall not fuss with the distinction between $\operatorname{ind}_g$
defined on $\mathbf{Z}_m^*$, or simply on certain integers.

Consider $m = p = 17$; so by 4-6-10 there exists a primitive root (mod 17).
To find one, here and in general, is mostly a matter of brute force. Testing
2, we find that $2^8 \equiv 1 \pmod{17}$, so 2 is not a primitive root (mod 17). Testing
3 next, we find it to be a primitive root (mod 17). In fact, we have, with all
congruences (mod 17),

$$\begin{aligned}
3^1 &\equiv 3, & 3^2 &\equiv 9, & 3^3 &\equiv 10, & 3^4 &\equiv 13, \\
3^5 &\equiv 5, & 3^6 &\equiv 15, & 3^7 &\equiv 11, & 3^8 &\equiv 16, \\
3^9 &\equiv 14, & 3^{10} &\equiv 8, & 3^{11} &\equiv 7, & 3^{12} &\equiv 4, \\
3^{13} &\equiv 12, & 3^{14} &\equiv 2, & 3^{15} &\equiv 6, & 3^{16} &\equiv 1.
\end{aligned}$$

[Note: To see that 3 is a primitive root (mod 17) it is not necessary to
compute beyond $3^8 \equiv 16$. We know at the start that the exponent to which
3 belongs (that is, the order of 3 as an element of the group $\mathbf{Z}_{17}^*$) is a
divisor of 16; so as soon as we arrive at $3^8 \equiv 16 \not\equiv 1 \pmod{17}$, the conclusion
is that 3 belongs to the exponent 16.]

Putting $g = 3$ and writing “ind” for “$\operatorname{ind}_3$” we have, in particular,
ind 10 = 3. It is useful, for computational purposes, to make a full table of
indices. In the current situation, using our list of powers of 3, we have

m = p = 17, g = 3

a       |  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
ind_g a | 16  14   1  12   5  15  11  10   2   3   7  13   4   9   6   8

It is sometimes convenient to put this table in the form:

a       |  3   9  10  13   5  15  11  16  14   8   7   4  12   2   6   1
ind_g a |  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16

Because $m = p = 17$ is small it does not really matter which table is available;
for large $m$, the choice of table depends on whether one wants to find $\operatorname{ind}_g a$
when $a$ is given, or to find $a$ when $\operatorname{ind}_g a$ is given.

Parenthetically, we remark that there are $\phi(\phi(17)) = 8$ primitive roots
(mod 17). In fact, since 3 is a generator of a cyclic group of order $\phi(17) = 16$,
$3^n$ is a primitive root (mod 17) $\iff (n, \phi(17)) = 1$. Thus, the primitive roots
are the odd powers of 3, namely,

3, 10, 5, 11, 14, 7, 12, 6.

Any one of these primitive roots could be used to make the table of indices.
For example, using the primitive root $g = 7$, we have:

m = p = 17, g = 7

a       |  1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16
ind_g a | 16  10   3   4  15  13   1  14   6   9   5   7  12  11   2   8
The facts about the index are useful in solving certain kinds of congruences
or equations in residue class rings. As illustrations, let us solve:

(i) $8x^5 \equiv 10 \pmod{17}$; that is, $8x^5 = 10$ in $\mathbf{Z}_{17}$.

(ii) $7x^{10} \equiv 9 \pmod{17}$; that is, $7x^{10} = 9$ in $\mathbf{Z}_{17}$.

(iii) $5(7^x) \equiv 16 \pmod{17}$; that is, $5(7^x) = 16$ in $\mathbf{Z}_{17}$.

We work with $m = p = 17$, $g = 3$ and write “ind” for “$\operatorname{ind}_3$.” The treatment
of (i) consists of:

The integer $x$ satisfies $8x^5 \equiv 10 \pmod{17}$

$\iff \operatorname{ind}(8x^5) = \operatorname{ind} 10$ (use 4-6-14)

$\iff \operatorname{ind} 8 + 5 \operatorname{ind} x \equiv \operatorname{ind} 10 \pmod{16}$

$\iff 10 + 5 \operatorname{ind} x \equiv 3 \pmod{16}$

$\iff 5(\operatorname{ind} x) \equiv -7 \equiv 9 \pmod{16}$

$\iff \operatorname{ind} x = 5$

$\iff x \equiv 5 \pmod{17}$.

In other words, $\lfloor 5 \rfloor_{17}$ is the unique solution of $8x^5 = 10$ in $\mathbf{Z}_{17}$.

Solving (ii) by this method leads to

$$\operatorname{ind} 7 + 10 \operatorname{ind} x \equiv \operatorname{ind} 9 \pmod{16}$$

which becomes

$$10 \operatorname{ind} x \equiv -9 \equiv 7 \pmod{16}.$$

Since $(10, 16) = 2$ does not divide 7, this linear congruence has no solution.
Thus, the congruence $7x^{10} \equiv 9 \pmod{17}$ has no solution.

Solving (iii) by this method leads to

$$\operatorname{ind} 5 + x \operatorname{ind} 7 \equiv \operatorname{ind} 16 \pmod{16}$$

which becomes

$$11x \equiv 3 \pmod{16}.$$

This has the unique solution $x \equiv 9 \pmod{16}$!!
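The index method used for (i), (ii), and (iii) reduces each congruence to a linear congruence in the index. A minimal sketch of that reduction follows; the function names `solve_linear`, `solve_power`, and `solve_exponential` are hypothetical, the linear congruence is solved by exhaustive search, and all coefficients are assumed to be relatively prime to $p$.

```python
def index_table(g, m):
    """Map each unit a (mod m) to ind_g a; assumes g is a primitive root (mod m)."""
    table, x, r = {}, 1, 0
    while x not in table:
        table[x] = r
        x = (x * g) % m
        r += 1
    return table

def solve_linear(c, d, n):
    """All y in 0..n-1 with c*y congruent to d (mod n)."""
    return [y for y in range(n) if (c * y - d) % n == 0]

def solve_power(c, k, b, p, g):
    """Solutions x (mod p) of c * x^k = b in Z_p, via ind c + k ind x = ind b."""
    ind = index_table(g, p)
    ys = solve_linear(k, ind[b % p] - ind[c % p], p - 1)
    return sorted(pow(g, y, p) for y in ys)

def solve_exponential(c, a, b, p, g):
    """Solutions x (mod p - 1) of c * a^x = b in Z_p, via ind c + x ind a = ind b."""
    ind = index_table(g, p)
    return solve_linear(ind[a % p], ind[b % p] - ind[c % p], p - 1)

print(solve_power(8, 5, 10, 17, 3))        # (i):   [5]
print(solve_power(7, 10, 9, 17, 3))        # (ii):  []
print(solve_exponential(5, 7, 16, 17, 3))  # (iii): [9]
```

Any other primitive root (mod 17) may be passed as `g`; the indices change but the solutions do not.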

Of course, the index can also be used for computations for certain composite
$m$. Consider, for example, $m = 22 = 2 \cdot 11$. Then $\mathbf{Z}_{22}^*$ is a cyclic group
with $\phi(22) = 10$ elements; these may be represented by 1, 3, 5, 7, 9, 13, 15,
17, 19, 21. There exists a primitive root (mod 22). By trial and error, 3 and 5
are not primitive roots, but 7 is. In fact, (mod 22) the powers of 7 give

$$\begin{aligned}
&7^1 \equiv 7, \quad 7^2 = 49 \equiv 5, \quad 7^3 \equiv 35 \equiv 13, \quad 7^4 \equiv 91 \equiv 3, \quad 7^5 \equiv 21, \\
&7^6 \equiv 147 \equiv 15, \quad 7^7 \equiv 105 \equiv 17, \quad 7^8 \equiv 119 \equiv 9, \\
&7^9 \equiv 63 \equiv 19, \quad 7^{10} \equiv 133 \equiv 1.
\end{aligned}$$

This leads to the index table:

m = 22, g = 7

a       |  1   3   5   7   9  13  15  17  19  21
ind_g a | 10   4   2   1   8   3   6   7   9   5

There are $\phi(\phi(22)) = 4$ primitive roots (mod 22); they are given by $7^n$ where
$(n, 10) = 1$, $n < 10$, namely, $7^1, 7^3, 7^7, 7^9$. Using the table, the four primitive
roots are 7, 13, 17, 19.

If we try to solve $7x^{10} \equiv 9 \pmod{22}$ by using indices, then

$$\operatorname{ind} 7 + 10 \operatorname{ind} x \equiv \operatorname{ind} 9 \pmod{\phi(22)}$$

and

$$10 \operatorname{ind} x \equiv 7 \pmod{10}.$$

Hence, there is no solution.

Note that, as it stands, $7x^{10} \equiv 8 \pmod{22}$ cannot be attacked via the
index; because 8 is not relatively prime to 22, ind 8 has no meaning.

4-6-16. Exercise. Let $m$ be an integer for which there exists a primitive
root $g$ (that is, $\mathbf{Z}_m^*$ is cyclic); let $n$ be any integer greater than or equal to 1
and put $d = (n, \phi(m))$. Then:

(i) The mapping $a \to a^n$ is a homomorphism of $\mathbf{Z}_m^* \to \mathbf{Z}_m^*$. Its image,
which we denote by

$$(\mathbf{Z}_m^*)^n = \{a^n \mid a \in \mathbf{Z}_m^*\},$$

is the subgroup (in $\mathbf{Z}_m^*$) of all $n$th powers. Denoting the kernel by

$$N_{m,n} = \{a \in \mathbf{Z}_m^* \mid a^n = 1\}$$

we have an isomorphism of groups

$$\mathbf{Z}_m^* / N_{m,n} \approx (\mathbf{Z}_m^*)^n.$$

(ii) For $a \in \mathbf{Z}_m^*$ we have

$$a \in N_{m,n} \iff n \operatorname{ind} a \equiv 0 \pmod{\phi(m)}.$$

Therefore, the kernel $N_{m,n}$ has exactly $d$ elements. Furthermore, $(\mathbf{Z}_m^*)^n$ is a
cyclic group of order $\phi(m)/d$.

(iii) For $b \in \mathbf{Z}_m^*$ we have

$$b \in (\mathbf{Z}_m^*)^n \iff b^{\phi(m)/d} = 1.$$

(One way to do this is to show that each side is equivalent to $d \mid \operatorname{ind} b$.) In
the case where $b \in (\mathbf{Z}_m^*)^n$, the equation $x^n = b$ has exactly $d$ solutions in $\mathbf{Z}_m^*$.

Rephrasing some of these results in the language of congruences we have,
with $m$, $n$, $d$ as above, and $b$ an integer relatively prime to $m$: The congruence

$$x^n \equiv b \pmod{m}$$

has a solution (in which case $b$ is said to be an $n$th power residue of $m$)

$$\iff d \mid \operatorname{ind} b \iff b^{\phi(m)/d} \equiv 1 \pmod{m}$$

and in this situation there are exactly $d$ solutions (mod $m$).

Furthermore, there are $\phi(m)/d$ incongruent $n$th power residues of $m$;
they are the roots of the congruence

$$x^{\phi(m)/d} \equiv 1 \pmod{m}.$$
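The counting statements of this exercise can be sanity-checked by brute force before attempting a proof. The sketch below is an illustration under assumptions: the helper names `units`, `nth_powers`, `criterion`, and `solutions` are introduced here, and the sample moduli 17, 22, 27, and 9 were chosen because each has a primitive root.

```python
from math import gcd

def units(m):
    """Representatives of Z_m^* for m >= 2."""
    return [a for a in range(1, m) if gcd(a, m) == 1]

def nth_powers(m, n):
    """The subgroup of nth powers in Z_m^*, as a sorted list."""
    return sorted({pow(a, n, m) for a in units(m)})

def criterion(b, m, n):
    """Test b^(phi(m)/d) = 1 in Z_m^*, where d = (n, phi(m))."""
    phi_m = len(units(m))
    d = gcd(n, phi_m)
    return pow(b, phi_m // d, m) == 1

def solutions(b, m, n):
    """All x in Z_m^* with x^n = b."""
    return [x for x in units(m) if pow(x, n, m) == b % m]

for m, n in [(17, 10), (17, 5), (22, 4), (27, 6), (9, 3)]:
    phi_m = len(units(m))
    d = gcd(n, phi_m)
    powers = nth_powers(m, n)
    count_ok = len(powers) == phi_m // d
    test_ok = all((b in powers) == criterion(b, m, n) for b in units(m))
    sols_ok = all(len(solutions(b, m, n)) == d for b in powers)
    print(m, n, d, count_ok, test_ok, sols_ok)
```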


4-6-17 / PROBLEMS 

1. For each $a$ satisfying $1 \le a \le m - 1$ and $(a, m) = 1$ find the exponent to
which $a$ belongs (mod $m$) when $m$ equals

(i) 7, (ii) 11, (iii) 20, (iv) 21.

2. Find the exponent of 13 (mod $m$) for $m$ equal to

(i) 17, (ii) 19, (iii) 22, (iv) 41.

3. If $a$ is an integer with $a^r \equiv 1 \pmod{m}$ for some $r \ge 1$, show that $(a, m) = 1$.

4. If $ab \equiv 1 \pmod{m}$, then $a$ and $b$ belong to the same exponent (mod $m$).

5. Show that if $a > 1$ and $m > 1$, then $m \mid \phi(a^m - 1)$.

6. We know (Fermat): If $p$ is prime and $(a, p) = 1$, then $a^{p-1} \equiv 1 \pmod{p}$.
Show the falsity of the converse, by example; in other words, exhibit
integers $a, m$ with $(a, m) = 1$, $m$ not prime and $a^{m-1} \equiv 1 \pmod{m}$.

7. (i) Suppose $a$ is relatively prime to both $m$ and $n$. If $a$ belongs to the
exponent $r$ (mod $m$) and to the exponent $s$ (mod $n$), show that $a$
belongs to the exponent $[r, s]$ (mod $[m, n]$).

(ii) Can you give a group theoretic interpretation of this result?

(iii) Generalize this result to: If $(a, m_1 m_2 \cdots m_t) = 1$ and $a$ belongs to
the exponent $r_i$ (mod $m_i$), $i = 1, \ldots, t$, then $a$ belongs to the exponent
$[r_1, r_2, \ldots, r_t]$ (mod $[m_1, m_2, \ldots, m_t]$).
8. Find the exponent to which 7 belongs modulo

(i) $11 \cdot 13 \cdot 17$, (ii) $3 \cdot 5 \cdot 19$, (iii) $3^3 \cdot 5^2 \cdot 17$.

9. How many primitive roots are there modulo

(i) 13, (ii) 18, (iii) 19, (iv) 21,

(v) 25, (vi) 28, (vii) 29?

10. If $g$ is a primitive root (mod $p^2$), show it is a primitive root (mod $p$).
(Here, as elsewhere in this section, $p$ is understood to be an odd prime.)

11. If $g$ is a primitive root (mod $p$) and $gg' \equiv 1 \pmod{p}$, then $g'$ is a primitive
root (mod $p$).

12. If $g$ is a primitive root (mod $p$) show that $g^{(p-1)/2} \equiv -1 \pmod{p}$.

13. For $p > 3$, the product of the $\phi(p - 1)$ primitive roots (mod $p$) is $\equiv 1$
(mod $p$).

14. For each value of $m$, find the maximum order for the elements of the
group $\mathbf{Z}_m^*$:

(i) 18, (ii) $7^5$, (iii) 3600,

(iv) 7200, (v) $2^2 \cdot 17^3 \cdot 37 \cdot 67$, (vi) $2 \cdot 11 \cdot 19^2 \cdot 31^3$.

15. If $m$ and $n$ are relatively prime positive integers, prove that $\lambda(mn) =
[\lambda(m), \lambda(n)]$.

16. Suppose $g$ is a primitive root (mod $p$). If $p \equiv 1 \pmod{4}$, prove that
$-g$ is a primitive root (mod $p$). What happens if $p \equiv 3 \pmod{4}$?

17. If $m > 6$ has a primitive root then show that the product of the $\phi(\phi(m))$
primitive roots (mod $m$) is congruent to 1 (mod $m$).

18. For $m = p = 17$ construct the table of indices when

(i) $g = 5$, (ii) $g = 6$, (iii) $g = 10$.

Then use each of these tables to solve the congruences

$8x^5 \equiv 10 \pmod{17}$, $7x^{10} \equiv 9 \pmod{17}$, $5(7^x) \equiv 16 \pmod{17}$.

19. Find a primitive root (mod $p$) for each of the following primes

(i) 7, (ii) 11, (iii) 13, (iv) 19, (v) 23, (vi) 29.

In each case use the primitive root to construct a table of indices. Find
all the primitive roots for each $p$.

20. Find a primitive root modulo

(i) 9, (ii) 10, (iii) 14, (iv) 18, (v) 25, (vi) 27.

Use the primitive root to construct a table of indices. In each case, find
all the primitive roots.

21. Solve the congruence $5x^7 \equiv 8 \pmod{m}$ for each of the following values
of $m$: 7, 9, 10, 11, 13, 14, 17, 18, 19, 22, 23, 25, 27, 29.
22. Do Problem 21 for

(i) $8x^{10} \equiv 5 \pmod{m}$, (ii) $3x^6 \equiv 7 \pmod{m}$.

23. Solve $x^5 \equiv 4 \pmod{7 \cdot 9 \cdot 13 \cdot 17}$.

24. Solve T 2 = 13 (mod 19). 

25. Exhibit integers $a$ and $b$ for which the congruence $ax^6 \equiv b \pmod{22}$

(i) has a solution, (ii) does not have a solution.

Do the same for $ax^7 \equiv b \pmod{22}$.

26. Show that for an odd prime $p$ there exists a primitive root $g$ (mod $p$)
which is not a primitive root (mod $p^2$).

27. What happens to 4-6-6, 4-6-7, 4-6-8 when $p = 2$?

28. Prove directly: If there exists a primitive root (mod $p^r$), $p$ an odd prime,
then there exists a primitive root (mod $p^{r+1}$).

29. If both $g$ and $g'$ are primitive roots (mod $p$), then show that $gg'$ is not a
primitive root (mod $p$).

30. Suppose $g$ is a primitive root (mod $m$), $a \in \mathbf{Z}$, $(a, m) = 1$. Show that
(mod $m$), $a$ belongs to the exponent

$$\frac{\phi(m)}{(\operatorname{ind}_g a, \phi(m))}.$$

Furthermore, $a$ is a primitive root (mod $m$) $\iff (\operatorname{ind}_g a, \phi(m)) = 1$.

31. Suppose $g$ is a primitive root (mod $p^r$). If $g$ is odd, then $g$ is a primitive
root (mod $2p^r$). If $g$ is even, then $g + p^r$ is a primitive root (mod $2p^r$).

32. If both $g$ and $g'$ are primitive roots (mod $p$), show that for $(a, p) = 1$

$$\operatorname{ind}_{g'} a \equiv (\operatorname{ind}_g a)(\operatorname{ind}_{g'} g) \pmod{p - 1}.$$

This is the analog of the rule for changing the base of logarithms.

33. If $g$ is a primitive root (mod $p^2$), then

$$x^{p-1} \equiv 1 \pmod{p^2}$$

has the $p - 1$ distinct roots $g^{np}$, $n = 1, 2, \ldots, p - 1$, and no others.

34. Discuss the circumstances under which

$$ax^n \equiv b \pmod{p}$$

has a solution. How many solutions are there?

35. Suppose the prime $p$ is of form $2^n + 1$. If $\left(\frac{a}{p}\right) = -1$ show that $a$ is a
primitive root (mod $p$). What about the converse?
36. (i) Consider the congruence

$$x^3 \equiv a \pmod{p}, \quad\text{where } 1 \le a \le p - 1.$$

Show that if $p \equiv 2 \pmod{3}$, there is a solution for each $a$, and if
$p \equiv 1 \pmod{3}$, there is a solution for $(p - 1)/3$ of the choices for $a$.

(ii) Find a corresponding result for

$$x^4 \equiv a \pmod{p}, \quad 1 \le a \le p - 1,$$

which depends on whether $p \equiv 1$ or $3 \pmod{4}$.

(iii) What about $x^5 \equiv a \pmod{p}$ and $x^6 \equiv a \pmod{p}$?


Miscellaneous Problems 


1. Suppose G is an arbitrary group in which every element has finite order. 
If the nonempty subset H is closed under the operation, show that it is a 
subgroup. 

2. Suppose G is a subgroup of S n . If G contains an odd permutation, show 
that exactly half of its elements are even permutations, and that these 
form a subgroup. 

3. Show that in an ordered field F, the set of all positive elements forms a 
group under multiplication. 

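4. Consider all maps $\phi_{a,b} : \mathbf{R} \to \mathbf{R}$ defined by

$$\phi_{a,b}(x) = ax + b, \quad a, b \in \mathbf{R}.$$

Taking composition of maps as the operation, decide if these mappings
form a group when $a$ and $b$ are restricted by:

(i) $a, b \in \mathbf{Z}$.

(ii) $a \ne 0$, $a \in \mathbf{Z}$, $b \in \mathbf{R}$.

(iii) $a \ne 0$, $a \in \mathbf{R}$, $b \in \mathbf{Z}$.

(iv) $a \ne 0$, $a \in \mathbf{Q}$, $b \in \mathbf{R}$.

(v) $a \ne 0$, $a \in \mathbf{Q}$, $b \in \mathbf{R}$, $b \notin \mathbf{Q}$.

When is the group commutative?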

5. Let $\mathbf{R}' = \mathbf{R} \cup \{\infty\}$, the set consisting of all real numbers and the element
$\infty$. Consider all mappings $\phi : \mathbf{R}' \to \mathbf{R}'$ of form

$$\phi(x) = \frac{ax + b}{cx + d},$$

where $a, b, c, d \in \mathbf{R}$ and $ad - bc = 1$. Show that with respect to the operation
of composition, these mappings form a group.
6. Show that the additive group of rationals, $\{\mathbf{Q}, +\}$, has no maximal proper
subgroup. (Maximal means here that it is not contained in any other
proper subgroup.)

7. Suppose G is a commutative semigroup with cancellation. If G is finite 
then it is a group. If G is infinite then it can be imbedded in a group. 
How is this related to the imbedding of an integral domain in a field? 

8. Consider an additive abelian group, $G$, and the set $S(G)$ of all endomorphisms
of $G$. (In other words, the elements of $S(G)$ are homomorphisms
of $G$ into itself.) For $\phi, \psi \in S(G)$ define $\phi + \psi$ and $\phi \circ \psi$ as
usual, by

$$(\phi + \psi)(a) = \phi(a) + \psi(a),$$

$$(\phi \circ \psi)(a) = \phi(\psi(a)), \quad a \in G.$$

Show that $\{S(G), +, \circ\}$ is a ring with unity. What is $\{S(\mathbf{Z}), +, \circ\}$? What
is $\{S(\mathbf{Z}_m), +, \circ\}$?

9. Suppose $G$ is a finite group with subgroups $K \subset H \subset G$. Show that

$$(G : K) = (G : H)(H : K).$$

How is this related to Lagrange’s theorem, 4-2-16? What happens if the
group $G$ is infinite?

10. Suppose $H_1, H_2, \ldots, H_r$ are subgroups of $G$, each of finite index in $G$.
Show that their intersection is of finite index in $G$.

11. If $F$ and $H$ are subgroups of the finite group $G$, show that

$$\#(F) \cdot \#(H) \le \#(F \cap H) \cdot \#(F \vee H).$$

(Here $F \vee H$ denotes $[F \cup H]$, the subgroup generated by the set $F \cup H$.)

12. If F and H are finite subgroups of G, show that 

$$\#(F) \cdot \#(H) = \#(F \cap H) \cdot \#(FH).$$

(Note that FH need not be a subgroup.) 

13. If F and H are subgroups of G, show that 

$$(F : F \cap H) \leq (F \vee H : H).$$

Moreover, if both $(F \vee H : F)$ and $(F \vee H : H)$ are finite and relatively
prime, show that

$$(F : F \cap H) = (F \vee H : H) \quad \text{and} \quad (H : F \cap H) = (F \vee H : F).$$

14. If N is a normal subgroup of the finite group G, show that N contains
every subgroup of G whose order is prime to (G: N). 

15. Show that any subgroup of prime power order, $p^r$, contains a subgroup of
order p.

16. Let R be a commutative ring with unity, 1, and consider $\mathcal{M}(R, 2)$, the ring
of $2 \times 2$ matrices with entries from R. For a matrix

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathcal{M}(R, 2)$$

define its determinant by $\det A = ad - bc$; it is an element of R. For
$A, B \in \mathcal{M}(R, 2)$ we have

$$\det(AB) = (\det A)(\det B).$$

Of course, $\det I = 1$.

Show that the matrix A has an inverse [that is, A is a unit of the ring
$\mathcal{M}(R, 2)$] $\iff$ $\det A$ is a unit of R. Moreover, when A has an inverse, find
an expression for it.

Apply these facts to find the group of units of the rings R, 2) and 
JOTL, 2). 
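
The unit criterion in Problem 16 can be explored numerically. The following is a minimal sketch, assuming the ring is $\mathbf{Z}_m$ and using the classical adjugate formula (one standard expression for the inverse, not necessarily the form the text intends). The function names `inverse_2x2_mod` and `mul_2x2_mod` are hypothetical, and a matrix is stored as the tuple (a, b, c, d).

```python
from math import gcd

def mul_2x2_mod(x, y, m):
    """Product of two 2x2 matrices over Z_m; each matrix is a tuple (a, b, c, d)."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % m, (a * f + b * h) % m,
            (c * e + d * g) % m, (c * f + d * h) % m)

def inverse_2x2_mod(matrix, m):
    """Inverse of a 2x2 matrix over Z_m, or None when det is not a unit of Z_m."""
    a, b, c, d = matrix
    det = (a * d - b * c) % m
    if gcd(det, m) != 1:
        return None  # det is a zero divisor or zero, so no inverse exists
    det_inv = pow(det, -1, m)  # modular inverse, available from Python 3.8
    # adjugate formula: A^{-1} = det^{-1} * (d, -b, -c, a)
    return ((d * det_inv) % m, (-b * det_inv) % m,
            (-c * det_inv) % m, (a * det_inv) % m)

if __name__ == "__main__":
    A = (1, 2, 3, 5)                      # determinant -1, a unit of Z_6
    A_inv = inverse_2x2_mod(A, 6)
    print(A_inv, mul_2x2_mod(A, A_inv, 6))
    print(inverse_2x2_mod((2, 0, 0, 1), 6))   # determinant 2 is not a unit of Z_6
```

Over a general commutative ring the same formula applies with `det_inv` replaced by the inverse of the determinant in that ring.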

17. Consider the domain $D = \mathrm{Map}(\mathbf{Z}_{>0}, \mathbf{Z})$ of arithmetic functions (see
Miscellaneous Problem 57 of Chapter II). Prove the following:

(i) The group, G, of units of D is

$$\{f \in D \mid f(1) = \pm 1\}.$$

(ii) The set H, of nonzero multiplicative arithmetic functions, is a
subgroup of G.

(iii) The arithmetic function u given by

$$u(n) = 1 \quad \text{for all } n \in \mathbf{Z}_{>0}$$

belongs to H, and its inverse, $u^{-1}$, is the Möbius function, $\mu$.
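
Part (iii) of Problem 17 can be checked on small cases. The sketch below assumes that multiplication in D is Dirichlet convolution (the usual product on arithmetic functions; the definition in Chapter II is not reproduced here) and that the identity element is the function taking the value 1 at n = 1 and 0 elsewhere. All function names are hypothetical.

```python
def dirichlet(f, g, n):
    """Dirichlet convolution (f * g)(n) = sum of f(d) * g(n / d) over divisors d of n."""
    return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)

def mobius(n):
    """Moebius function: 0 if n has a square factor, else (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result

def u(n):
    """The constant arithmetic function u(n) = 1."""
    return 1

def identity(n):
    """Identity for Dirichlet convolution: 1 at n = 1, 0 elsewhere."""
    return 1 if n == 1 else 0

if __name__ == "__main__":
    print(all(dirichlet(u, mobius, n) == identity(n) for n in range(1, 200)))
```

A finite check of this kind is evidence, not a proof; the problem still asks for an argument valid for every n.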

18. Let $H = \{\sigma \in S_5 \mid \sigma\colon \{2, 4\} \to \{2, 4\}\}$. Show that H is a subgroup of $S_5$.
Describe its left and right cosets. Is it normal? 

19. Consider $G = \{\phi\colon \mathbf{R} \to \mathbf{R} \mid \phi(x) = ax + b,\ a \neq 0,\ a, b, x \in \mathbf{R}\}$. This is a
group with respect to composition of mappings. Let $H = \{\phi \in G \mid a = 1\}$;
then H is a subgroup of G. Describe its left and right cosets.

20. Prove the following: 

(i) The additive group of $\mathbf{Z}[i]$ is isomorphic to the direct sum of two
infinite cyclic groups.

(ii) The additive group of the polynomial ring $\mathbf{Z}[x]$ is isomorphic to the
multiplicative group of positive rational numbers.

21. If the sets X and X' are in 1-1 correspondence, show that the symmetric
groups $S_X$ and $S_{X'}$ are isomorphic.


22. Suppose S is a nonempty subset of the group G, and put 

$$C_G(S) = \{a \in G \mid ax = xa \text{ for all } x \in S\}, \qquad N_G(S) = \{a \in G \mid aS = Sa\}.$$

These are known as the centralizer and normalizer of S in G, respectively.
Prove the following:

(i) $C_G(S)$ is a subgroup of G. Note that $C_G(G) = \mathfrak{Z}_G$, the center of G.

(ii) If H is a commutative subgroup of G then H is a normal subgroup
of $C_G(H)$.

(iii) $N_G(S)$ is a subgroup of G that contains $C_G(S)$.

Suppose H is a subgroup of G; show that:

(iv) H is normal if and only if $G = N_G(H)$.

(v) H is normal in $N_G(H)$; in fact, $N_G(H)$ is the unique largest subgroup
of G in which H is normal.

(vi) $(G : N_G(H))$ equals the number of conjugates of H.

23. Let A, B, C be subgroups (they need not be normal) of the group G with
$A \supset B$. Show that $A \cap BC = B(A \cap C)$. This is known as the modular
law of Dedekind.

If, in addition, $A \cap C = B \cap C$ and $AC = BC$, show that $A = B$.


24. Call a matrix A of $\mathcal{M}(\mathbf{R}, 2)$ or $\mathcal{M}(\mathbf{Z}, 2)$ nonsingular if it has an inverse.
In each case, show that the set of all matrices of determinant 1 is a normal 
subgroup of the multiplicative group of all nonsingular matrices. 

25. Let C be the commutator subgroup of the group G (see 4-4-15, Problem 
8). Show that 

$$C = \{a_1 a_2 \cdots a_n a_1^{-1} a_2^{-1} \cdots a_n^{-1} \mid a_i \in G,\ i = 1, \ldots, n,\ n \geq 2\}$$

by using the identity

$$(aba^{-1}b^{-1})(cdc^{-1}d^{-1}) = a(ba^{-1})b^{-1} \cdot c(dc^{-1})d^{-1} \cdot a^{-1}(ab^{-1})b \cdot c^{-1}(cd^{-1})d.$$

26. If N is a normal subgroup of G for which G/N is abelian then $N \supset C$.
What about the converse ? 


27. For a prime p consider 


Z (P 00 ) 


a 

? 


a, neZ 


Show that under addition this is an abelian group which is not cyclic. 
Furthermore, show that every proper subgroup is finite and cyclic. 

28. Suppose the subset S generates the group G, $G = [S]$, and $\#(S) = m$.
If S is minimal in the sense that no subset of S generates G , is it true that 
any generating set of G has at least m elements ? 

29. Suppose G is abelian; then for any $r > 1$, show that $F = \{a^r \mid a \in G\}$
and $H = \{a \in G \mid a^r = e\}$ are subgroups of G. If $\#(G) = n$ and $(r, n) = 1$,
the map $a \to a^r$ is an automorphism of G.

30. Suppose N is a normal subgroup of the finite group G. If G/N contains an
element of order n, show that G contains an element of order n. 

31. Suppose G is cyclic of order $n = rs$, and let H be the subgroup of order s.
Show that

$$H = \{a \in G \mid a^s = 1\} = \{a^r \mid a \in G\}.$$

32. If $\operatorname{ord} a = mn$ with $(m, n) = 1$, show that a can be written uniquely in the
form $a = bc$ with $bc = cb$ and $\operatorname{ord} b = m$, $\operatorname{ord} c = n$; in fact, both b and c
are powers of a.

33. Give an example of two elements of finite order whose product has infinite 
order. 

34. If G is cyclic of order n and G' is cyclic of order m, with $m \mid n$, then G' is a
homomorphic image of G. Find all homomorphic images of G. 

35. List all factor groups of each of the following groups (see 4-5-18): 

(i) $D_5$, (ii) $D_6$, (iii) $S_4$, (iv) $S_5$.

36. Exhibit at least six distinct groups of order 8. 

37. Describe all subgroups of $\{\mathbf{Z}_n, +\}$.

38. Find the center of $D_n$, the group of rigid motions of the regular n-gon
(i.e., the nth dihedral group of 4-5-18).

39. Show that the group $W_n$ (see 4-2-8) has exactly n elements by making use
of the fact that, in $\mathbf{C}[x]$, the greatest common divisor of $f(x) = x^n - 1$
and its derivative $f'(x)$ is 1 (see Miscellaneous Problem 12 of Chapter III).

40. A mapping of groups $\phi\colon G \to G'$ that is one-to-one and onto is said to be
an anti-isomorphism when it has the property $\phi(ab) = \phi(b)\phi(a)$ for all
$a, b \in G$.

For $a \in G$ define $R_a \in S_G$ by $R_a(x) = xa$, $x \in G$; so $R_a$ is “right
multiplication” by a in G. In analogy with Cayley’s theorem, 4-4-10,
show that $a \to R_a$ is an anti-isomorphism of G into $S_G$ (in other words, it
is an injective antihomomorphism). Note, incidentally, that $L_a$ and $R_b$
commute (as elements of $S_G$).

41. Suppose H is a subgroup of G and put $X = \{aH\}$, the set of left cosets of H
in G. For $b \in G$, define $L_b\colon X \to X$ by $L_b(aH) = baH$. Then $L_b \in S_X$ and
the mapping $b \to L_b$ is a homomorphism of $G \to S_X$. What is the kernel?
What happens if we use right cosets instead of left cosets?

42. This is an expanded version of the isomorphism theorem, 4-4-12. Suppose
$\phi\colon G \to G'$ is a surjective homomorphism with kernel N. With any
subset S of G we may associate the subset $\phi(S) = \{\phi(a) \mid a \in S\}$ of G'.
Furthermore, with any subset S' of G' we may associate the subset
$\phi^{-1}(S') = \{a \in G \mid \phi(a) \in S'\}$ of G; $\phi^{-1}(S')$ is called the preimage of S'.
Show that:

(i) There is a 1-1 correspondence between subgroups H of G with
$H \supset N$, and subgroups H' of G'. In fact, we have $H \leftrightarrow H'$ when
$\phi(H) = H'$ and $\phi^{-1}(H') = H$. (Note that $G \leftrightarrow G'$.)

(ii) $G/N \approx G'$; in fact, the map $\bar{\phi}\colon G/N \to G'$ defined by $\bar{\phi}(aN) = \phi(a)$
is such an isomorphism. Even more, $\phi$ induces a homomorphism of
H onto H' with kernel N, and $\bar{\phi}$ provides an isomorphism $H/N \approx H'$.

(iii) H is normal in G $\iff$ H' is normal in G'; and in this situation,
$G/H \approx G'/H'$.

43. Given groups $N \subset H \subset G$ with N normal in G, show that H is normal in
G $\iff$ $H/N$ is normal in $G/N$; and in this situation,

$$\frac{G/N}{H/N} \approx \frac{G}{H}.$$

44. Suppose H and N are subgroups of G with N normal. If $G = HN$ and
$H \cap N = (e)$, show that $G/N \approx H$.

45. (i) Consider the cube. It has eight vertices and six congruent faces; it is
sometimes called the regular hexahedron. The rigid motions form a
group with 24 elements (known as the hexahedral group) which may be
viewed as a subgroup of $S_8$. Show that the symmetries (i.e., rotations
about some axis) of the cube form a group; it is the hexahedral 
group. 

(ii) The cube has 4 diagonals and each of our motions is characterized 
by the way it permutes the diagonals. Show that the hexahedral 
group is isomorphic to S 4 . 

(iii) Take the midpoints of the faces of the cube, and connect them by 
straight lines. The result is a figure with 6 vertices and 8 congruent 
faces (what are the faces?) that is known as the regular octahedron. 
The rigid motions form a group with 24 elements (known as the 
octahedral group), which may be viewed as a subgroup of S 6 . Show 
that the symmetries constitute the same group, and that the octa¬ 
hedral and hexahedral groups are isomorphic. 

46. (i) If n is odd, show that

(a) $2n^2 + 1$ is divisible by a prime $p \equiv 3 \pmod 8$

(b) $4n^2 + 1$ is divisible by a prime $p \equiv 5 \pmod 8$

(ii) If n is arbitrary, show that

(a) $8n^2 - 1$ is divisible by a prime $p \equiv 7 \pmod 8$

(b) any odd prime p that divides $n^4 + 1$ is congruent to 1 (mod 8).

(iii) Use the preceding to show that the number of primes of each of the
forms $8n + 1$, $8n + 3$, $8n + 5$, $8n + 7$ is infinite.
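
Before attempting the proofs in Problem 46, the claims can be tested for small n. The sketch below is only a numerical experiment under the reading of the four expressions as $2n^2 + 1$, $4n^2 + 1$, $8n^2 - 1$ and $n^4 + 1$; the helper names `prime_factors` and `has_prime_factor_mod8` are hypothetical.

```python
def prime_factors(n):
    """Set of prime divisors of an integer n >= 2, found by trial division."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_prime_factor_mod8(value, residue):
    """True when some prime divisor p of value satisfies p % 8 == residue."""
    return any(p % 8 == residue for p in prime_factors(value))

if __name__ == "__main__":
    odd = range(1, 100, 2)
    print(all(has_prime_factor_mod8(2 * n * n + 1, 3) for n in odd))
    print(all(has_prime_factor_mod8(4 * n * n + 1, 5) for n in odd))
    print(all(has_prime_factor_mod8(8 * n * n - 1, 7) for n in range(1, 100)))
    print(all(p % 8 == 1
              for n in range(1, 100)
              for p in prime_factors(n ** 4 + 1) if p != 2))
```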

47. Consider two subgroups H , K (which need not be distinct) of the group 
G. A set of form HxK , x e G, is called a double coset. Then two double 
cosets are disjoint or identical; so G decomposes into a disjoint union of 
double cosets. Each double coset HxK can be expressed as a union of 
left cosets of K in G; the number of such left cosets is $(H : H \cap xKx^{-1})$.
Similarly, HxK is the union of right cosets of H in G; the number of such
right cosets is $(K : K \cap x^{-1}Hx)$. If $G = \bigcup_\alpha Hx_\alpha K$ is a decomposition into
double cosets, show that

$$(G : K) = \sum_\alpha (H : H \cap x_\alpha K x_\alpha^{-1}) \quad \text{and} \quad (G : H) = \sum_\alpha (K : K \cap x_\alpha^{-1} H x_\alpha).$$

48. (i) The multiplicative group G with identity e is said to operate or act
on the set X if we have a map of $G \times X \to X$, denoted by $(\sigma, x) \to \sigma x$,
satisfying $ex = x$ and $\tau(\sigma x) = (\tau\sigma)x$ for all $\sigma, \tau \in G$, $x \in X$. This may
be rephrased as follows: we have a homomorphism of $G \to S_X$ (and
the image of $\sigma$ in $S_X$ is also denoted by $\sigma$). Note that distinct elements
of G may have the same action on X.

(ii) The set $Gx = \{\sigma x \mid \sigma \in G\}$ is called the orbit of x with respect to G.
We call x equivalent to y (with respect to G) if there exists $\sigma \in G$ such
that $\sigma x = y$ (that is, if $y \in Gx$). This is an equivalence relation, and
the equivalence classes are the orbits Gx. Thus, X is a disjoint union
of orbits Gx. Of course, $x \in Gx$ and $Gx = Gy \iff y \in Gx$.

(iii) Put $X_G = \{x \in X \mid \sigma x = x \text{ for all } \sigma \in G\}$; so $X_G$ consists of all $x \in X$
whose orbits Gx consist of a single element. For $x \in X$ we also put
$G_x = \{\sigma \in G \mid \sigma x = x\}$. Then $G_x$ is a subgroup of G and
$\#(Gx) = (G : G_x)$.

(iv) The following relation, known as the class equation, holds:

$$\#(X) = \#(X_G) + \sum_x (G : G_x),$$

where x runs over a system of representatives of all the orbits with
more than one element.

(v) Apply the preceding to the case X = G, where the action of G on X
is by inner automorphisms; that is, $(\sigma, \tau) \to \tau^\sigma = \sigma\tau\sigma^{-1}$. The orbits
are called conjugate classes. Elements in the same conjugate class are
said to be conjugates of each other. Two conjugate classes are identical
or disjoint. We have $G_\tau = \{\sigma \in G \mid \tau^\sigma = \sigma\tau\sigma^{-1} = \tau\}$, so it is the
centralizer, $C_G(\tau)$, (see Problem 22) of $\tau$ in G. Furthermore, $G_\tau = G$
$\iff$ the orbit of $\tau$ consists of one element $\iff$ $\tau \in \mathfrak{Z}_G$, the center
of G. The class equation takes the form [where we write $C_\tau$ for $C_G(\tau)$]

$$\#(G) = \#(\mathfrak{Z}_G) + \sum_\tau (G : C_\tau),$$

where $\tau$ runs over a system of representatives of the conjugate classes
with more than one element.
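
The class equation of Problem 48(v) can be observed directly on small symmetric groups. The sketch below is an illustration only: permutations of {0, ..., n-1} are stored as tuples of images, and the names `compose`, `inverse`, `conjugacy_classes` and `class_equation` are hypothetical.

```python
from itertools import permutations

def compose(s, t):
    """Composite permutation s o t, i.e. i -> s(t(i))."""
    return tuple(s[t[i]] for i in range(len(t)))

def inverse(s):
    """Inverse permutation of s."""
    inv = [0] * len(s)
    for i, image in enumerate(s):
        inv[image] = i
    return tuple(inv)

def conjugacy_classes(group):
    """Partition a finite group of permutations into orbits under conjugation."""
    remaining = set(group)
    classes = []
    while remaining:
        t = next(iter(remaining))
        orbit = {compose(compose(s, t), inverse(s)) for s in group}
        classes.append(orbit)
        remaining -= orbit
    return classes

def class_equation(n):
    """Return (#G, #center, sorted sizes of the classes with more than one element) for S_n."""
    group = list(permutations(range(n)))
    classes = conjugacy_classes(group)
    center_size = sum(1 for c in classes if len(c) == 1)
    nontrivial = sorted(len(c) for c in classes if len(c) > 1)
    return len(group), center_size, nontrivial

if __name__ == "__main__":
    for n in (3, 4):
        order, center, sizes = class_equation(n)
        print(n, order, center, sizes, order == center + sum(sizes))
```

For $S_3$ this gives 6 = 1 + 2 + 3, and for $S_4$ it gives 24 = 1 + 3 + 6 + 6 + 8.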

49. Prove the following: 

(i) A subgroup H of G is normal if and only if H is the union of conjugate
classes.

(ii) The only finite group with exactly two conjugate classes is the group
of order 2.

50. Suppose G has order pq , where p and q are primes, with p <q. Show that 
G cannot have two distinct subgroups of order q. 

51. A group is said to be simple when it has no nontrivial normal subgroups. 
Show that: 

(i) An abelian group is simple if and only if it is cyclic of prime order.

(ii) The alternating group A 4 is not simple. 

(iii) The alternating group A 5 is simple. 

(iv) For n> 5, the alternating group A n is simple. 

52. A nonempty subset A of the ring R is said to be an ideal of R when it 
satisfies the following two conditions: 

(1) $a, b \in A \Rightarrow a - b \in A$; that is, A is a subgroup of the additive group of R;

(2) For any $r \in R$ and $a \in A$ both ra and ar are in A.

(i) Show that (0) and R are ideals of R. Any other ideal is said to be 
a proper ideal. 

(ii) An ideal is a subring of R. Show that the converse is false. 

(iii) Suppose R has an identity, 1. If the ideal A contains 1, show
that $A = R$; even more, if A contains a unit, show that $A = R$.

(iv) If R is commutative show that $aR = \{ar \mid r \in R\}$ is an ideal
(known as a principal ideal). If, in addition, R has a 1, show that
$aR = Ra$ is the smallest ideal that contains a.

(v) Show that a field has no proper ideals. 

(vi) Show that the intersection of ideals is an ideal.

(vii) Given ideals A and B, define their sum as

$$A + B = \{a + b \mid a \in A,\ b \in B\}$$

and show that it is an ideal.

(viii) Given ideals A and B, define their product $A \cdot B$ to be the set of
all elements that can be written as a finite sum $a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$
where $a_1, a_2, \ldots, a_n \in A$ and $b_1, \ldots, b_n \in B$ (n is not
fixed). Show that $A \cdot B$ is an ideal with $A \cdot B \subset A \cap B$.

(ix) For an ideal A, show that $A^{\perp} = \{x \in R \mid xA = (0)\}$ is an ideal.
Note that the same result holds for an arbitrary subset A of R.

(x) For an ideal A, show that $(R : A) = \{x \in R \mid Rx \subset A\}$ is an ideal
that contains A.

53. By a left ideal of the ring R one means a nonempty subset A that is a
group under addition (in other words, A is a subgroup of the additive
group of R), and such that for any $r \in R$ and $a \in A$ we have $ra \in A$.
A right ideal of R is defined in analogous fashion. Discuss the extent
to which the assertions of Problem 52 carry over to left ideals.

54. (i) Consider an ideal I of the ring R. Additively, I is a normal subgroup
of the additive group of R, so we may form the factor group $R/I$.
Its elements are the cosets $\lfloor a \rfloor_I = a + I$, and the addition in $R/I$ is
given by $(a + I) + (b + I) = (a + b) + I$. Now let us define multiplication
in $R/I$ by

$$(a + I)(b + I) = ab + I, \qquad a, b \in R.$$

Show that this multiplication is well defined, and $R/I$ becomes a
ring. In addition, the canonical map $a \to a + I$ is a surjective homomorphism
of $R \to R/I$, and its kernel is I.

(ii) Suppose R, R' are rings and $\phi\colon R \to R'$ is a surjective homomorphism;
show that $I = \ker \phi$ is an ideal of R, and that the factor ring
$R/I$ is isomorphic to R'.

(iii) The preceding remarks indicate that ideals of a ring may be viewed
as analogs of normal subgroups of a group. Pursue this further by
carrying the results of Problems 42 and 43 over to ideals.

55. (i) Define a Euclidean domain to be an integral domain D in which to
every nonzero element $a \in D$ there is assigned an integer $\phi(a) \geq 0$
such that:

(1) If $a \mid b$ then $\phi(a) \leq \phi(b)$.

(2) If $a, b \in D$ with $a \neq 0$ then there exist elements $q, r \in D$ for
which $b = qa + r$, where either $r = 0$ or $\phi(r) < \phi(a)$. (This is a
hard disjunctive: either or, but not both.)

Define a principal ideal domain (PID) to be a domain in which every
ideal is principal. Show that a Euclidean domain is a PID.

(ii) By a greatest common divisor (gcd) of two elements a, b of the
domain D we mean an element $d \in D$ such that

(1) $d \mid a$ and $d \mid b$

(2) if $c \mid a$ and $c \mid b$ then $c \mid d$.

(A gcd need not exist, but if one exists it is unique up to a unit.
Note that for $a, b \in D$, $a \mid b \iff Db \subset Da$; also a and b are associates
if and only if $Da = Db$.) Show that any two nonzero elements
a, b in a PID have a gcd, d. Moreover, there exist $r, s \in D$
such that $d = ra + sb$. [Even though d is not unique, we write
$(a, b) = d$.]

(iii) In a PID, if (a, b) = 1 (in which case, we say that a and b are 
relatively prime) and a\bc, show that a\c. 

(iv) A nonzero element $p \in D$ is said to be irreducible if it cannot be
written as a product of two elements neither of which is a unit (or
equivalently, when p has no “proper” divisor, where by a proper
divisor one means a divisor that is neither a unit nor an associate of
p). The element $p \in D$ is said to be prime if $p \mid ab$, $p \nmid a \Rightarrow p \mid b$.
Show that:

(1) In any domain, p is prime $\Rightarrow$ p is irreducible.

(2) In a PID, p is prime if and only if p is irreducible.

(v) A unique factorization domain (UFD) is a domain in which 

(1) Every nonunit can be expressed as a product of irreducible 
elements. 

(2) Factorization into irreducible elements is unique up to order 
and units. 

Show that if D is a domain satisfying condition (1) for a UFD, then 
D is a UFD if and only if every irreducible element is prime. 

(vi) Let $Da_0 \subset Da_1 \subset Da_2 \subset \cdots$ be an ascending chain (possibly infinite)
of ideals of D; show that $\bigcup_i Da_i$ is an ideal. In a PID,
there cannot exist an infinite strictly ascending chain of ideals
(meaning that adjacent ideals in the chain are not equal).

(vii) Prove: A PID is a UFD; hence, a Euclidean domain is a UFD.

(viii) Suppose D is a domain satisfying condition (1) for a UFD; show
that D is a UFD if and only if every pair of nonzero elements has a
gcd.

56. Show that $\mathbf{Z}[i]$, the domain of Gaussian integers, becomes a Euclidean
domain when we put, for $\alpha = a + bi$,

$$\phi(\alpha) = N(\alpha) = \alpha\bar{\alpha} = (a + bi)(a - bi) = a^2 + b^2 \in \mathbf{Z}.$$

Thus, $\mathbf{Z}[i]$ is a unique factorization domain. To locate all the primes, the
following observations (along with Miscellaneous Problem 66 of Chapter
III) should prove useful.

(i) If $\alpha \in \mathbf{Z}[i]$ then $\alpha$ divides $N(\alpha)$;

(ii) If $\pi$ is a prime of $\mathbf{Z}[i]$ then $\pi$ divides exactly one prime of $\mathbf{Z}$;

(iii) To find all primes of $\mathbf{Z}[i]$ it suffices to see how all primes of $\mathbf{Z}$
factor in $\mathbf{Z}[i]$ into primes;

(iv) If p is a prime of $\mathbf{Z}$ and $\pi$ is a prime of $\mathbf{Z}[i]$ with $\pi \mid p$ then $N(\pi) = p$ or
$N(\pi) = p^2$. In the latter case, $\pi$ and p are associates, so p is also
prime in $\mathbf{Z}[i]$.

(v) If $p \equiv 3 \pmod 4$ there is no $\pi = a + bi$ for which $N(\pi) = a^2 + b^2 = p$;
so such p remain prime in $\mathbf{Z}[i]$.

(vi) The prime factorization of $p = 2$ is $2 = (1 + i)(1 - i)$, and the primes
$1 + i$, $1 - i$ of $\mathbf{Z}[i]$ are associates.

(vii) If $p \equiv 1 \pmod 4$ then, by quadratic reciprocity, there exists $n \in \mathbf{Z}$
for which $p \mid (1 + n^2)$. Then $p \mid (n + i)(n - i)$, $p \nmid (n + i)$, $p \nmid (n - i)$
and p is not prime. In fact, $p = \pi_1\pi_2$ where $\pi_1, \pi_2$ are primes of
norm p, and if $\pi_1 = a + bi$ then $\pi_2 = a - bi$. In particular, any
prime $p \equiv 1 \pmod 4$ can be expressed as the sum of two squares,
$p = a^2 + b^2$.
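
The outline in Problem 56(vii) is effectively an algorithm. The sketch below, offered as one possible realization and not as the text's own method, represents a Gaussian integer as a pair (re, im), divides with remainder by rounding the exact quotient to the nearest Gaussian integer, and takes a greatest common divisor of p and n + i to obtain a prime of norm p. The function names are hypothetical, and the search for n uses Euler's criterion instead of quadratic reciprocity.

```python
def gauss_divmod(x, y):
    """Division with remainder in Z[i]: returns (q, r) with x = q*y + r and N(r) < N(y)."""
    (a, b), (c, d) = x, y
    norm = c * c + d * d
    # x * conj(y) = (ac + bd) + (bc - ad) i; divide by N(y) and round each part
    re, im = a * c + b * d, b * c - a * d
    q = ((2 * re + norm) // (2 * norm), (2 * im + norm) // (2 * norm))
    r = (a - (q[0] * c - q[1] * d), b - (q[0] * d + q[1] * c))
    return q, r

def gauss_gcd(x, y):
    """A greatest common divisor in Z[i] (determined only up to a unit)."""
    while y != (0, 0):
        _, r = gauss_divmod(x, y)
        x, y = y, r
    return x

def sqrt_minus_one(p):
    """Some n with n*n congruent to -1 mod p, for a prime p congruent to 1 mod 4."""
    for c in range(2, p):
        n = pow(c, (p - 1) // 4, p)
        if (n * n + 1) % p == 0:
            return n
    raise ValueError("p must be a prime congruent to 1 mod 4")

def two_squares(p):
    """Return (a, b) with a*a + b*b == p for a prime p congruent to 1 mod 4."""
    n = sqrt_minus_one(p)
    a, b = gauss_gcd((p, 0), (n, 1))
    return abs(a), abs(b)

if __name__ == "__main__":
    for p in (5, 13, 17, 29, 97, 101):
        a, b = two_squares(p)
        print(p, a, b, a * a + b * b == p)
```

The rounding step is what makes the norm a Euclidean function: each coordinate of the exact quotient is moved by at most 1/2, so the remainder has norm at most half that of the divisor.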

57. Suppose H and N are normal subgroups of the multiplicative group G.
If $H \cap N = (e)$, show that HN (which, according to 4-4-15, Problem 10,
is a subgroup of G) is isomorphic to the external direct product group
$H \times N$.

58. (i) Let $G_1, G_2, \ldots, G_m$ be subgroups of the multiplicative group G.
Prove that the following two conditions, I and II, are equivalent:

I

(1) Each $G_i$ is normal in G.

(2) $G = G_1 G_2 \cdots G_m$. (This refers to the usual product of sets in a
group, so it asserts that any $a \in G$ is of form $a = a_1 a_2 \cdots a_m$
with $a_i \in G_i$.)

(3) $G_i \cap (G_1 \cdots G_{i-1} G_{i+1} \cdots G_m) = (e)$ for each $i = 1, \ldots, m$.

II

(1) For $i \neq j$, $G_i$ and $G_j$ commute elementwise. (That is, $a_i a_j = a_j a_i$
for all $a_i \in G_i$, $a_j \in G_j$.)

(2) Every $a \in G$ can be expressed uniquely in the form
$a = a_1 a_2 \cdots a_m$, with $a_i \in G_i$.

When these conditions are satisfied we say that G is the (internal)
direct product (or sum, when G is additive) of $G_1, G_2, \ldots, G_m$. We
shall then write $G = G_1 \cdot G_2 \cdot G_3 \cdots G_m$.

(ii) If G is the internal direct product of $G_1, G_2, \ldots, G_m$ show that G is
canonically isomorphic [under the correspondence $a_1 a_2 \cdots a_m \leftrightarrow (a_1, a_2, \ldots, a_m)$]
with the external direct product $G_1 \times G_2 \times \cdots \times G_m = \prod_{i=1}^{m} G_i$.

(iii) Conversely, if $G_1, G_2, \ldots, G_m$ are arbitrary groups and we form their
external direct product $G = G_1 \times G_2 \times \cdots \times G_m$, show that for
each $i = 1, \ldots, m$ there exists a subgroup $G_i'$ of G such that $G_i$ and
$G_i'$ are canonically isomorphic [under the correspondence $a_i \leftrightarrow (1, \ldots, 1, a_i, 1, \ldots, 1)$]
and G is the internal direct product of $G_1', G_2', \ldots, G_m'$.

(iv) If G is the internal direct product of $G_1, \ldots, G_m$, show that for
each j, $G/G_j$ is isomorphic to $\prod_{i \neq j} G_i$ (and to the internal direct
product of the $G_i$, $i \neq j$, in the appropriate group).

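59. For any positive integer n, let $\mathfrak{Z}_n$ denote “the” cyclic group of order n.
Given $m > 1$, with prime factorization

$$m = 2^{r_0} p_1^{r_1} p_2^{r_2} \cdots p_s^{r_s}, \qquad r_0 \geq 0,\ r_1, r_2, \ldots, r_s > 0,$$

show that

$$\mathbf{Z}_m^* \approx \mathfrak{Z}_2 \times \mathfrak{Z}_{2^{r_0 - 2}} \times \mathfrak{Z}_{\phi(p_1^{r_1})} \times \cdots \times \mathfrak{Z}_{\phi(p_s^{r_s})} \quad \text{(external direct product)}$$

when $r_0 \geq 3$. If $r_0 = 2$, the second term, $\mathfrak{Z}_{2^{r_0 - 2}}$, is omitted; when $r_0 = 0, 1$
the first two terms $\mathfrak{Z}_2$, $\mathfrak{Z}_{2^{r_0 - 2}}$ are omitted. In particular, show that
$\mathbf{Z}_m^*$ is the internal direct product of such cyclic subgroups.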

60. Consider $\mathcal{M}(F, n)$, the ring of $n \times n$ matrices with entries from a field F.
Prove that it is a simple ring —by which is meant that it has no proper 
ideals (see Problem 52). Hint: The notation introduced in Miscellaneous 
Problem 54 of Chapter II may be useful. 

Problems for Chapter I

1-1-3 / ANSWERS 
1. 26; 17.

n                  10   17   24   72   77   79   97   210   420
no. of divisors     4    2    8   12    4    2    2    16    24
sum of divisors    18   18   60  195   96   80   98   576  1344

5. Use trial and error, or something “clever” if you see it.

6. Proceed as in Problem 5; the answer to both questions is no.

8. (i) 3, 6, 9, 0, -3, -6, ... in fact, all multiples of 3; in particular, 1 is
not a linear combination of 3 and 6.

(ii) All integers; 1 is a linear combination of 3 and 5.

9. (i) The set of all integers of form 2n as n runs over Z; in other words,
the set of all multiples of 2.

(iv) The set of all integers of form ax + by as both x and y run over Z;
in other words, the set of all linear combinations of a and b.

(v) The set of all integers of form b - xa as x runs over Z.

(vii) The set of all positive divisors of a.

(viii) The set of all positive integers n for which the relation

$$\sum_{i=1}^{n} i = \frac{n(n + 1)}{2}$$

holds. (This turns out to be the set of all positive integers, since
$1 + 2 + 3 + \cdots + n = \tfrac{1}{2}[n(n + 1)]$.)
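
The divisor counts and divisor sums tabulated in the answers to 1-1-3 are easy to recompute. The following is a minimal brute-force sketch; the function names `divisors` and `divisor_table` are hypothetical, and trial division is chosen for transparency, not speed.

```python
def divisors(n):
    """All positive divisors of a positive integer n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]

def divisor_table(values):
    """Map each n to the pair (number of divisors, sum of divisors)."""
    return {n: (len(divisors(n)), sum(divisors(n))) for n in values}

if __name__ == "__main__":
    table = divisor_table([10, 17, 24, 72, 77, 79, 97, 210, 420])
    for n, (count, total) in table.items():
        print(n, count, total)
```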


1-2-7 / ANSWERS 

1. (i) $\{5 + 7t \mid t = 0, 1, 2, \ldots\}$; (ii) $\{5 + 7t \mid t = 0, 1, 2, \ldots\}$;

(iii) $\{2 + 7t \mid t = 0, 1, 2, \ldots\}$; (iv) $\{4 + 9t \mid t = 0, 1, 2, \ldots\}$.

2. For (i)-(v), the last nonzero remainder is 92. For (vi) and (vii), it is 113.

5. In the first part, show that the product of $4m + 1$ and $4n + 1$ can be
expressed in the form $4k + 1$.

6. Use the fact that an arbitrary integer is of form $10q + r$ with $0 \leq r \leq 9$.

10. 8 is of form $3n - 1$, but not of form $6n + 5$.

11. $100 = 8 \cdot 7 + 4 \cdot 11$; $99 = 11 \cdot 7 + 2 \cdot 11$; $101 = 5 \cdot 7 + 6 \cdot 11$.

12. It expresses the geometric idea discussed just before. 


1-3-13 / ANSWERS

3. (i) $(91, 143) = 13 = (-3)(91) + (2)(143)$;

(iii) $(-143, -91) = 13 = (-2)(-143) + 3(-91)$;

(iv) $(5311, 7571) = 113 = (10)(5311) + (-7)(7571)$;

(v) $(5311, -7571) = 113 = (10)(5311) + 7(-7571)$.

6. (ii) For example, we also have $(x_0 + b)a + (y_0 - a)b = d$.

11. Show that $(m, n)$ divides $(u, v)$. Then solve for m and n in terms of u
and v.

12. (i) Put $d = (a + b, a - b)$ and show $d \mid 2$.

15. Clear, as soon as one observes that there exists some linear combination 
xa + yb that is positive. 

16. (i) Two integers are relatively prime if and only if some linear combination
of them is equal to one.

(ii) If an integer divides a product of two terms and is relatively prime
to the first then it divides the second.

(iii) If two relatively prime integers divide the same integer then their 
product also divides it. 

(iv) If an integer is relatively prime to each of two given integers, then it is 
relatively prime to their product. 
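
The linear combinations quoted in answer 3 can be produced, and checked, with the extended Euclidean algorithm. The following is a minimal sketch (the function name `extended_gcd` is hypothetical). The coefficients it returns need not coincide with the ones printed above, since, as answer 6 points out, they are not unique.

```python
def extended_gcd(a, b):
    """Return (d, x, y) with d = gcd(a, b) >= 0 and d == x*a + y*b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # invariant: old_r == old_x*a + old_y*b and r == x*a + y*b
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        # normalize so that the gcd is reported as a nonnegative integer
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y

if __name__ == "__main__":
    for a, b in [(91, 143), (-143, -91), (5311, 7571), (5311, -7571)]:
        d, x, y = extended_gcd(a, b)
        print(a, b, d, x, y, d == x * a + y * b)
```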

1-4-8 / ANSWERS

1. 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 
79, 83, 89, 97, 101, 103, 107, 109, 113, 127, 131, 137, 139, 149, 151, 157, 
163, 167, 173, 179, 181, 191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 
241, 251, 257, 263, 269, 271, 277, 281, 283, 293. 

2. Consider the smallest prime p that divides n. 

3. $5 = 1^2 + 2^2$, $13 = 2^2 + 3^2$, $17 = 1^2 + 4^2$, $29 = 2^2 + 5^2$,

$37 = 1^2 + 6^2$, $41 = 4^2 + 5^2$, $53 = 2^2 + 7^2$, $61 = 5^2 + 6^2$,

$73 = 3^2 + 8^2$, $89 = 5^2 + 8^2$, $97 = 4^2 + 9^2$, $101 = 1^2 + 10^2$, etc.

6. It works.

9. (i) $p^3$; (ii) p; (iii) p; (iv) $p^2$, or $p^5$ if $pa - b = 0$.

10. (i) p or $p^2$; (ii) $p^2$; (iii) p, $p^2$, or $p^3$; (iv) $p^2$ or $p^3$.

17. (i) T; (ii) T; (iii) T; (iv) F: a = 3, b = 4, c = 3, p = 5; (v) T;

(vi) T; (vii) F: $a = 7^3$, $b = 7^2$.

19. No. 
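
The list of primes in answer 1 can be regenerated with the sieve of Eratosthenes. The sketch below is a minimal version; the function name `primes_below` is hypothetical, and the bound 300 is read off from the list itself, whose last entry is 293.

```python
def primes_below(limit):
    """All primes p with p < limit (limit >= 2), by the sieve of Eratosthenes."""
    sieve = [True] * limit
    sieve[0:2] = [False, False]          # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # cross out the multiples of p, starting at p*p
            for multiple in range(p * p, limit, p):
                sieve[multiple] = False
    return [n for n, is_prime in enumerate(sieve) if is_prime]

if __name__ == "__main__":
    primes = primes_below(300)
    print(len(primes), primes[:10], primes[-1])
```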

1-5-13 / ANSWERS 

1. $a = 2 \cdot 3 \cdot 5^2 \cdot 11 \cdot 17$, $b = 3 \cdot 5 \cdot 7^2 \cdot 13$, $(a, b) = 3 \cdot 5$,

$[a, b] = 2 \cdot 3 \cdot 5^2 \cdot 7^2 \cdot 11 \cdot 13 \cdot 17$.

2. $2^3 \cdot 3^2 \cdot 5 \cdot 7 \cdot 11 \cdot 13$.

3. (i) $(a, b) = 2^2 \cdot 3$, $(a, c) = 5$, $(b, c) = 7$;

(ii) $[a, b] = 2^5 \cdot 3^2 \cdot 5^2 \cdot 7^2 \cdot 17$, $[a, c] = 2^5 \cdot 3^2 \cdot 5^2 \cdot 7 \cdot 11^2 \cdot 19$,

$[b, c] = 2^2 \cdot 3 \cdot 5 \cdot 7^2 \cdot 11^2 \cdot 17 \cdot 19$;

(iii) $(a, b, c) = 1$, $[a, b, c] = 2^5 \cdot 3^2 \cdot 5^2 \cdot 7^2 \cdot 11^2 \cdot 17 \cdot 19$.

4. $[5311, 7571] = 47 \cdot 67 \cdot 113 = (47 \cdot 67 \cdot 10)(5311) + (47 \cdot 67 \cdot (-7))(7571)$.

5. (i) $(a, b) = 137$, $(a, b, c) = 1$;

(ii) $[a, b] = 137 \cdot 179 \cdot 229$, $[a, b, c] = 137 \cdot 179 \cdot 229 \cdot 251$;

(iii) $[(a, b), c] = 137 \cdot 179 \cdot 251$, $([a, b], c) = 179$.

8. 1 and $a(a + 1)$. 9. a and b.

17. (i) F: a = 2, b = 3, c = 5; (ii) T; (iii) T; (iv) T; (v) T; (vi) T;

(vii) T; (viii) F: a = 2, b = 5; (ix) T; (x) T; (xi) F: $a = 2^3$,

$b = 2^2$; (xii) T.

19. For each p, $\min\{\nu_p(a), \nu_p(b)\} = 0$.

1-6-4 / ANSWERS 

1. (i) $x = 588 + 33t$, $y = -1617 - 91t$, or $x = 27 + 33t$, $y = -70 - 91t$.

(ii) No solutions. (iii) $x = 19 - 43t$, $y = 11 - 30t$.

(v) $x = 3 + 38t$, $y = -t$. (vi) $x = 8t$, $y = -13t$.

(vii) No solutions. (viii) No solutions.

3. (i) The positive solutions are of form $x = 604 + 7t$, $y = -1510 - 18t$,
where $-86 \leq t \leq -84$; so there are three of them: 2, 38; 9, 20; 16, 2.

(ii) There are infinitely many: $x = 604 - 7t$, $y = 1510 - 18t$, where
$t \leq 83$.

(iii) $x = 17 + 19t$, $y = 22 + 27t$, $t \geq 0$.

(iv) No positive solutions. 

5. (0-0*0 are the same, (w) and (vi) are the same. 

6. 4 + 4- = 44; in fact, the number of answers is infinite. 

7. $1147 + 2184t$, $t \in \mathbf{Z}$.

9. At some point one obtains a “ fraction ” that cannot take an integer value. 

10. (i) $x = 5u - 2$, $y = 3t - 3u + 2$, $z = 2t - 4u + 2$ for all $t, u \in \mathbf{Z}$. (ii) No
solution.

11. ll,;y = -19. 12. 1021. 13. (0 79. 

14. The story is fiction. 
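
Answer 3(i) above can be confirmed by direct enumeration. The equation itself is not quoted in this answer section; the sketch below assumes it is $18x + 7y = 302$, which is inferred here from the stated parametrization (substituting $x = 604 + 7t$, $y = -1510 - 18t$ gives 302 for every t). The function name `positive_solutions` is hypothetical.

```python
def positive_solutions(a, b, c):
    """All (x, y) with x > 0, y > 0 and a*x + b*y == c, for positive integers a, b."""
    solutions = []
    for x in range(1, c // a + 1):
        rest = c - a * x
        if rest > 0 and rest % b == 0:
            solutions.append((x, rest // b))
    return solutions

if __name__ == "__main__":
    # assumed equation, reconstructed from the parametrization in answer 3(i)
    print(positive_solutions(18, 7, 302))
    print([(604 + 7 * t, -1510 - 18 * t) for t in range(-86, -83)])
```

Both lines list the same three pairs, which agrees with the count given in the answer.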


1-7-16 / ANSWERS 

1. (i) mod 7: $3 \equiv 500 \equiv -312$, $19 \equiv 96$, $-15 \equiv -71 \equiv -113 \equiv 69 \equiv 153$,
$378 \equiv -91 \equiv -14$.

(iii) mod 13: $19 \equiv 240 \equiv 500$, $-113 \equiv 69$, $-91 \equiv -312$.

3. $m \mid a$.

4. 0; 18. 

5. Integer is divisible by 9 if and only if the sum of its digits is divisible by 9. 
Integer is divisible by 3 if and only if the sum of its digits is divisible by 3. 

6. Last digit is even; the number determined by the last two digits is divisible 
by 4; the number determined by the last three digits is divisible by 8 
[or if $n = \cdots + 10^2 a_2 + 10^1 a_1 + a_0$ then $8 \mid n \iff 8 \mid (a_0 + 2a_1 + 4a_2)$];
last digit is 0; last digit is 0 or 5. 

8. (/) a in each case. (//) a in each case. 

12. $\lfloor 84 \rfloor_5 = \lfloor -16 \rfloor_5$, $\lfloor -7 \rfloor_5 = \lfloor 53 \rfloor_5$, $\lfloor 43 \rfloor_7 = \lfloor 50 \rfloor_7$, etc.

13. (S) 


addition in Z_6            multiplication in Z_6

 + | 0 1 2 3 4 5            · | 0 1 2 3 4 5
---+------------           ---+------------
 0 | 0 1 2 3 4 5            0 | 0 0 0 0 0 0
 1 | 1 2 3 4 5 0            1 | 0 1 2 3 4 5
 2 | 2 3 4 5 0 1            2 | 0 2 4 0 2 4
 3 | 3 4 5 0 1 2            3 | 0 3 0 3 0 3
 4 | 4 5 0 1 2 3            4 | 0 4 2 0 4 2
 5 | 5 0 1 2 3 4            5 | 0 5 4 3 2 1
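
Tables such as the ones for $\mathbf{Z}_6$ in answer 13 can be generated for any modulus. The following is a minimal sketch; the function names `cayley_tables` and `format_table` are hypothetical, and the printed layout is only one convenient choice.

```python
def cayley_tables(n):
    """Addition and multiplication tables of Z_n as lists of rows."""
    add = [[(a + b) % n for b in range(n)] for a in range(n)]
    mul = [[(a * b) % n for b in range(n)] for a in range(n)]
    return add, mul

def format_table(symbol, table):
    """Render a table with a header row and a leading column of residues."""
    n = len(table)
    lines = [symbol + " | " + " ".join(str(b) for b in range(n))]
    lines.append("--+-" + "-" * (2 * n - 1))
    for a, row in enumerate(table):
        lines.append(str(a) + " | " + " ".join(str(v) for v in row))
    return "\n".join(lines)

if __name__ == "__main__":
    add, mul = cayley_tables(6)
    print(format_table("+", add))
    print()
    print(format_table("*", mul))
```

The zero entries off the first row and column of the multiplication table are exactly the products of zero divisors, which is why $\mathbf{Z}_6$ is not an integral domain.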

15. As a set any residue class (mod 6) consists of two residue classes (mod 12).

18. (i) n = 12; (ii) n = 3; (iii) n = 8; (iv) n = 16.

20. Solve 48 = 100 a + lb — 10 c — 5 d, where 0<a<l,0<&<3, 0<c<6, 
0 <d <7. 

1-8-8 / ANSWERS 
1. ( iv) 


addition in base 6                  multiplication in base 6

 + |  0  1  2  3  4  5               · |  0  1  2  3  4  5
---+------------------              ---+------------------
 0 |  0  1  2  3  4  5               0 |  0  0  0  0  0  0
 1 |  1  2  3  4  5 10               1 |  0  1  2  3  4  5
 2 |  2  3  4  5 10 11               2 |  0  2  4 10 12 14
 3 |  3  4  5 10 11 12               3 |  0  3 10 13 20 23
 4 |  4  5 10 11 12 13               4 |  0  4 12 20 24 32
 5 |  5 10 11 12 13 14               5 |  0  5 14 23 32 41


2. (i) (111001010001000111) 2 ; (iii) (100001232) 5 ; 

(fo) (5005543) 6 ; (vi) ( £ 38e3) 12 . 

3. (i) 845; (ii) 15414; (iv) 14415; (v) 8096; (vi) 39711.

5. (i) ( 1011022 ) 3 ; (H) ( 14202) 6 ; (iii) (4828) 12 ; (iv) (21422),. 

6 . (0 (1000110100 ) 2 ; 07) (121210),; (iv) (tsl9) I2 . 

7. 0) (1121 12 ) 3 ; 07) (104520) 6 ; (iii) (144460),; (it,) (519 £ ) 12 . 

8 . (0 (4411143) 6 ; (ii) (2604663),; (/«) (765622)j 2 . 

9. (i) $q = (110100)_2$, $r = (1)_2$; (ii) $q = (433)_5$, $r = (14)_5$;

(iii) $q = (95)_{12}$, $r = (17)_{12}$.

10. (i) 1; (ii) 1.


Problems for Chapter II 

2-1-21 / ANSWERS 

3. addition in Z_3          multiplication in Z_3

 + | 0 1 2                   · | 0 1 2
---+------                  ---+------
 0 | 0 1 2                   0 | 0 0 0
 1 | 1 2 0                   1 | 0 1 2
 2 | 2 0 1                   2 | 0 2 1

4. In $\mathbf{Z}_{11}$: $5 \cdot 8 = 7$, $5 \cdot 9 = 1$, $5 \cdot (8 + 9) = 8$, $(5 \cdot 8)9 = 8$, $5(8 \cdot 9) = 8$,
$9 \cdot (8 \cdot 5) = 8$, $8(5 \cdot 9) = 8$, $8(9 + 5) = 2$, $9(8 + 5) = 7$.

5. In $\mathbf{Z}_{11}$: $3 + (7 + 2) = 1 = 2 + (3 + 7)$, $(3(7 + 2))9 = 1 = (3 \cdot 7)9 + 3(2 \cdot 9)$,
$(3 + 7)(2 + 9) = 0 = (3 \cdot 2 + 7 \cdot 2) + (3 \cdot 9 + 7 \cdot 9) = (3 \cdot 2 + 3 \cdot 9) + (7 \cdot 2 + 7 \cdot 9)$,
$3(7 \cdot 2) = 9 = 2(7 \cdot 3)$, $(3 + 7)(3 + 7) = 1 = 3^2 + 7^2 + (3 \cdot 7 + 3 \cdot 7)$.

6. In $\mathbf{Z}_{12}$: $6 \cdot 2 = 2 \cdot 6 = 0$, $4 \cdot 3 = 3 \cdot 4 = 0$, $10 \cdot 6 = 6 \cdot 10 = 0$, $8 \cdot 3 = 3 \cdot 8 = 0$,
$8 \cdot 6 = 6 \cdot 8 = 0$, $6 \cdot 4 = 4 \cdot 6 = 0$, $6 \cdot 6 = 0$, $4 \cdot 9 = 9 \cdot 4 = 0$,
$8 \cdot 9 = 9 \cdot 8 = 0$; $11 \cdot 11 = 1$, $7 \cdot 7 = 1$, $5 \cdot 5 = 1$, $1 \cdot 1 = 1$.

7. Yes. 8. Z 2 . Yes, Z„, for example. 9. Yes. 

12. In $\mathbf{Z}_{13}$: $(5 - 10) + (7 - 11) = -9 = 4 = (5 + 7) - (10 + 11)$,
$5(10 - 7) = 2 = 5 \cdot 10 - 5 \cdot 7$,
$(5 - 10)(7 + 11) = 1 = (5 \cdot 7 - 10 \cdot 11) + (5 \cdot 11 - 10 \cdot 7) = (5 \cdot 7 + 5 \cdot 11) - (10 \cdot 7 + 10 \cdot 11)$,
$(5 - 10)(7 - 11) = 7 = (5 \cdot 7 + 10 \cdot 11) - (10 \cdot 7 + 5 \cdot 11)$.

13. (i) $(a^2 + ab) + (ba + b^2)$; (ii) $(a^2 - ab) + (-ba + b^2)$.

(iii) $(a^2 + ab - ac) + (-ba - b^2 + bc) + (ca + cb - c^2)$.

14. (i) $(a^2 + ab) + (ab + b^2)$, which we also write as $a^2 + 2ab + b^2$.

(ii) $a^2 - 2ab + b^2$. (iii) $(a^2 - b^2 - c^2) + 2bc$.

15. The relation holds for all $a, b \in R$ $\iff$ R is commutative.

16. There are many possible ways to insert the parentheses that describe the 
order in which the operations are to be performed. 


2-2-15 / ANSWERS 

1. (i) No; (ii) yes; (iii) yes; (iv) yes; (v) no.

2. (i) Yes (provided it is understood that we always reduce to lowest terms;
otherwise, not a ring); (ii) yes (when it is not required that $(m, p^r) = 1$);
(iii) yes (under the natural assumptions).

3. Yes. 4. It is an integral domain. 5. Yes.

6. (i) No; (ii) no, yes; (iii) it is a ring with unity.

7. No. 8. No.

9. Those that were found to be rings in Problems 1 and 2.

10. (i) (0), $\mathbf{Z}_7$; (ii) (0), $\mathbf{Z}_6$, {0, 3}, {0, 2, 4}.

12. (i) Yes; (ii) no. 14. (iii) Yes. 

15. For a counterexample, look in Z 12 . 

16. (0 Yes, yes; (ii) yes; (iii) A = f)A x . 

18. (i) Yes; (ii) $m\mathbf{Z} + n\mathbf{Z} = (m, n)\mathbf{Z}$.

21. A subring that contains the identity. 

22. (iM + fl=(_!j (ii) A + B + C=[_l 
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(ffl)4-B-(* Jj); ’J); 

W-4BC-(;^ £); M ^(B + C) = (_-« ®). 


26. (i) A + B = 


3 

-3 

3 

2 


2 

2 

-3 

-6 


6 - 1 \ 

5 ° l 

0 3 I ’ 

1 1 / 


(ii) A — B = 


(/Yi) /IB = 



27. Yes. 

28. All except (vii) and (ix).

31. (i) A ∪ B = {0, 1, 2, 3, 5, 7, 9}. (ii) A ∩ D = {3}.

(iii) H × C = {(4, 4), (4, 6), (6, 4), (6, 6)}. (iv) A - B = {0, 1}.

(v) A^c = {4, 5, 6, 7, 8, 9}. (vi) A ∪ A^c = X. (vii) B ∩ B^c = ∅.

(viii) A - (B - C) = {0, 1}. (ix) (A ∩ B) ∪ C = {2, 3, 4, 6}.

(x) A ∩ (B ∪ C) = {2, 3}. (xi) (// ∩ 7)^c = {0, 1, 2, 3, 4, 5, 7, 8, 9}.

(xii) (F ∪ G)^c = {2, 4, 6, 7, 9}. (xiii) A - (B ∪ C) = {0, 1}.

(xiv) A - (B ∩ D) = {0, 1, 2}.

33. 2^n; 3^n.

34. Say X = {1, 2, 3} and put A = ∅, B = {1}, C = {2}, D = {3}, E = {1, 2},
F = {1, 3}, G = {2, 3}, H = X. Then, for example, B + D = F, BD = A,
F + G = E, etc.

36. (i) Intersection of M_{x_0}'s, or sums. (ii) All subsets of X that do not
include x_0.

39. Set up 1-1 correspondences between them that preserve the ring
operations.

41. Z_72, Z_36 ⊕ Z_2, Z_8 ⊕ Z_9, Z_18 ⊕ Z_4, etc.

42. For each way of expressing m as a product, there is a ring with m 
elements. There are others. 

43. In Z 74 , take all multiples of 2. 
46. The Euclidean plane, and the lattice of all integral points.

47. R' ∩ R'' = R' ∩ A = R'' ∩ A = {(0, 0)}, R' + R'' = R' + A = R ⊕ R.

2-3-20 / ANSWERS 

3. (i) There are several cases to consider. (ii) The converse is true.

(iii) For any odd power the above hold, but not for even powers.

4. First treat < or >, then the situations where = occurs. 

5. Proceed as in Problem 4. 

6. Note that — e is negative. 

7. Consider (a - b)^2. 8. The case b < a is impossible.

9. Use 2-3-8. 10. Yes. 12. Yes; no. 

13. Distinguish cases judiciously. 

15. (i) n(n + 1); (ii) n^2.

16. (0 ^ {2a + (n- 1 )d }; (ii) ^ . 

20. Verify the assertion for n = 1; then supposing it true for n (inductively),
show that it holds for n + 1.

21. (i) No; (ii) no; (iii) yes.

23. The passage from n to n + 1 is “repeated” infinitely often, once for each
choice of n; but we prove it just once, for arbitrary n.

2-4-10 / ANSWERS

1. There are 5! rearrangements. More generally, for any n, the summation
of a_1, a_2, ..., a_n has n! rearrangements.

2. If a_1, a_2, ..., a_n, n > 2, are elements of the commutative ring R then
all products of these n elements, in any order, are equal.

6. Yes.

8. It is valid for any positive odd power, but not for even ones.

10. If R is not commutative, the terms in the expansion of (a + b)^n are all
the possible n-letter “words” in which each letter is either a or b.

11. The sum of all terms of form

(n!/(i! j! k!)) a^i b^j c^k

where i + j + k = n and 0 ≤ i, j, k ≤ n.

13. (iii) Anything of form (£ £) or (£ %). 

14. (i) 330; (ii) 3003; (iii) (333)(499)(997).

2-5-12 / ANSWERS

1. (i) Yes, no; (ii) no, yes; (iii) ψ ∘ φ maps 1 → 1, 2 → 2, 3 → 3, it is
injective and surjective; φ ∘ ψ maps 1 → 1, 2 → 4, 3 → 3, 4 → 4, it is
neither injective nor surjective.

2. For φ: Z → Z, (i), (ii), and (iv) are injective, whereas (i) and (iv) are
surjective, and every other answer is no. For φ: Q → Q: as above except
that (ii) is surjective. For φ: R → R, (v) is also surjective.

3. For Z_3:

        inj.   surj.  homo.  iso.
(i)     Yes    Yes    No     No
(ii)    Yes    Yes    No     No
(iii)   Yes    Yes    No     No
(iv)    No     No     No     No
(v)     Yes    Yes    Yes    Yes

For Z_5: as above, except that for (v) the row is all No.

For Z_6: row (i) is all No, with the rest as above.

6. (i) f maps 1 → 2, 2 → 1, 3 → 3; g maps 1 → 2, 2 → 3, 3 → 1. This example
also works for (ii).

7. f: n → 2n; g: m → m/2 when m is even, and g: m → 0 when m is odd.

9. ψφ(x) = ba + bx, φψ(x) = a + bx, φφ(x) = 2a + x, ψψ(x) = b^2 x. φ is
always injective and surjective. Whether ψ is injective or surjective
depends on properties of b. The question is probably too general, but
one does well to consider specific rings first.

15. It is surjective and the kernel consists of the four elements |0|_24, |6|_24,
|12|_24, |18|_24.

16. φ: |a|_m → |a|_n; it is surjective; kernel = {|kn|_m | k = 0, 1, ..., m/n - 1}.
If n ∤ m, φ is not well defined.

17. 0 is unity element; the inverse isomorphism is a → a + e.

22. Suppose there is an isomorphism φ, and show this leads to a contradiction.

Problems for Chapter III

3-1-12 / ANSWERS

1. (i) 1; (ii) 3; (iii) 0.

2. (i) 0; (ii) 113; (iii) 1.

4. (i) x ≡ 6, 103 (mod 194); (ii) x ≡ 74, 153 (mod 158);

(iii) x ≡ 434 (mod 605); (iv) x ≡ 54, 127, 200, 273, 346, 419 (mod 438).
Yes 
5. (i) |6|_24, |14|_24, |22|_24; (ii) |12|_26, |25|_26; (iii) no solution.

6. 44x ≡ 12 (mod 48).

7. (i) x ≡ 31 (mod 77); (iii) x ≡ 619 (mod 1092); (vii) no solution;
(ix) x ≡ 31 (mod 60); (x) no solution; (xiii) x ≡ 331 (mod 420).

8. ( i ) H93| 285 ; (fc>) |71| M0 . 

9. One choice is 34x ≡ 20 (mod 50) and 12x ≡ 15 (mod 23).

13. Unique solution is x ≡ 97 (mod 2^2 • 3^2 • 7 • 11).

14. There are three solutions (mod [24, 26]). 16. 23.

17. x ≡ 119 (mod 420).
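
Linear congruences such as those in this set can be checked mechanically. What follows is a minimal sketch, not the method of the text: the function name is ours, and it relies on Python's built-in three-argument pow with exponent -1 (Python 3.8 or later) to obtain a modular inverse. It uses the standard fact that ax ≡ b (mod m) is solvable exactly when d = (a, m) divides b, in which case there are d solutions modulo m.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """Return all x in range(m) with a*x congruent to b mod m, in increasing order."""
    d = gcd(a, m)
    if b % d != 0:
        return []                      # no solution when (a, m) does not divide b
    a_red, b_red, m_red = a // d, b // d, m // d
    x0 = (pow(a_red, -1, m_red) * b_red) % m_red   # unique solution mod m/d
    return [x0 + k * m_red for k in range(d)]      # lift to d solutions mod m

print(solve_linear_congruence(44, 12, 48))
print(solve_linear_congruence(34, 20, 50))
print(solve_linear_congruence(12, 15, 23))
```

The three calls use congruences that appear in the answers to Problems 6 and 9.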


3-2-23 / ANSWERS 

1. (i) F; (ii) T; (iii) T; (iv) T; (v) F; (vi) T; (vii) F; (viii) T.

2. (i) 1, 5; (ii) 1, 2, 4, 7, 8, 11, 13, 14; (iii) 1, 5, 7, 11, 13, 17, 19, 23.

3. (i) ±1; (ii) ±1; (iii) all nonzero elements of Q[√-5].

4. 1^-1 = 1, 2^-1 = 7, 3^-1 = 9, 4^-1 = 10, 5^-1 = 8, 6^-1 = 11, 7^-1 = 2, 8^-1 =
5, 9^-1 = 3, 10^-1 = 4, 11^-1 = 6, 12^-1 = 12.
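
The inverses in Problem 4 are consistent with arithmetic modulo 13; the modulus is inferred here from the listed values and is not stated in the answer itself. A short sketch that rebuilds the table (the function name is ours; pow with exponent -1 needs Python 3.8 or later):

```python
def inverse_table(p):
    """Map each nonzero residue a mod p to its multiplicative inverse."""
    return {a: pow(a, -1, p) for a in range(1, p)}

table = inverse_table(13)
for a, inv in table.items():
    print(a, inv)
```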

6. (i) Look in Q.

7. Assuming m ≠ 1, Q[√m] is a field.

9. (i) ±(2 + √3)^n, n ∈ Z; (ii) ±(5 + 2√6)^n, n ∈ Z;

(iii) ±(8 + 3√7)^n, n ∈ Z.

10. |28|_851 and |100|_851 are units; their inverses are found by solving the
appropriate congruence.
Multiplication tables for the units of Z_10 and of Z_14:

        1    3    7    9
   1    1    3    7    9
   3    3    9    1    7
   7    7    1    9    3
   9    9    7    3    1

        1    3    5    9   11   13
   1    1    3    5    9   11   13
   3    3    9    1   13    5   11
   5    5    1   11    3   13    9
   9    9   13    3   11    1    5
  11   11    5   13    1    9    3
  13   13   11    9    5    3    1



13. (i) 11^14 (10); (ii) 2^2 • 7^3 • 6 • 13 • 12 • 16 • 67^2 • 66;

(iii) 2 • 5 • 4 • 3^2 • 2 • 6; (iv) 7^4 • 6; (v) 2^7 • 3^2 • 11 • 23.
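
The answers to Problem 13 have the shape of values of the Euler φ-function written factor by factor, using φ(p^e) = p^(e-1)(p - 1) and multiplicativity. The sketch below computes φ from a prime factorization and compares it with a direct count. The factorizations passed in are our reading of the answers (for instance 11^15 for part (i)), not data taken from the problem statements.

```python
from math import gcd

def phi_from_factorization(factors):
    """factors maps each prime p to its exponent e; returns the product of p^(e-1) * (p-1)."""
    result = 1
    for p, e in factors.items():
        result *= p ** (e - 1) * (p - 1)
    return result

def phi_by_count(n):
    """Count the integers in 1..n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

print(phi_from_factorization({11: 15}) == 11 ** 14 * 10)
print(phi_from_factorization({2: 2, 5: 2}), phi_by_count(100))
```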

15. 14. 17. 35, 39, 45, 52, 56, 70, 72, 78, 84, 90. 

19. (ii) The intersection is [£| [mj n] . 
20. The kernel is [m_1, m_2, ..., m_r] Z; it is surely surjective if m_1, ..., m_r are
relatively prime in pairs.

21. The result is always 1 (see 3-2-22, Part 7, Euler’s theorem).

22. No. 24. (i) 24; (ii) 27; (iii) 49; (iv) 50; (v) 60.

26. No. No, unless it contains 1. 


3-3-15 / ANSWERS

1. (ii) (a) f + g = 3x^3 + 2x^2 + 3x + 2, f - g = 3x^3 + x,

f • g = 2x^6 + x^5 + 2x^4 + 2x^3 + x + 1.

(b) f + g = 4x^3 + 2x^2 + 3x + 2, f - g = x^2 + x + 4,

f • g = 4x^6 + 4x^5 + 3x^4 + 4x^3 + 2x^2 + x + 2.

(d) f + g = 6x^3 + 2x^2 + 3x + 2, f - g = 2x^3 + 3x^2 + x + 4,

f • g = x^6 + 3x^5 + 5x^4 + 5x^2 + x + 4.
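
The products in Problem 1 can be reproduced by multiplying coefficient lists and reducing modulo m. This is a sketch under stated assumptions: the polynomials f and g below, and the moduli 4, 5, 7 for parts (a), (b), (d), are inferred from the printed sums and differences and do not appear in this answer section.

```python
def poly_mul_mod(f, g, m):
    """Multiply polynomials given as coefficient lists (constant term first), mod m."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % m
    return out

# Hypothetical integer polynomials consistent with the printed f + g and f - g:
f = [3, 2, -1, -3]   # 3 + 2x - x^2 - 3x^3
g = [-1, 1, 3, 2]    # -1 + x + 3x^2 + 2x^3

for m in (4, 5, 7):
    print(m, poly_mul_mod(f, g, m))
```

Reading each printed list from the right gives the coefficients of x^6 down to the constant term.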

4. Write c = cx^0 and apply the rule for multiplication.

5. ax^i bx^j = abx^(i+j). 6. Only (i), (vi), and (vii) are subrings.

7. None. 8. Yes. 9. 2 • 3^4; 2 • 3^5; 3^6 including the 0 polynomial.

11. (i) and (iv) are zero divisors. 12. Yes.

14. (ii) They do not determine the same polynomial function.

15. (i) f(0) = 2, f(1) = 2, f(2) = 2; g(x) = 2, h(x) = x^9 - x^3 + 2.

16. 2x^2 - x + 1. 17. (ii) Yes.

18. (i) None; (ii) none; (iii) none; (iv) 2; (v) 6.

19. No, except in case (v).

20. ker E_c consists of all polynomials having c as a root; E_c = E_c' ⇔ c = c'.

23. (iii) No.

25. The units of R[[x]] are those power series whose constant term a_0 is a
unit of R.


3-4-17 / ANSWERS 

1. (i) Q*; (ii) R*; (iii) {1}; (iv) {1, 2}; (v) {1, 2, 3, 4, 5, 6};

(vi) {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; (vii) Z* = {1, -1}.

2. w(2x 2 — x 4- 1), where (0 w e { 1,2}; (») we Z* ; (w7) we Z* ; 

(to) weZ* t . 

3. (0 /i W and / 2 (x); (w) / 2 (x) and / 3 (x); (hi) f t (x) and / 4 (x). 

4. No. 5. (f), (iw)> (vii), and (iw77) are equivalence relations. 

6. Part (i7) is special. 

7. (0 x 2 — fx 4- f; (w) x 2 — 3x 4- 4. 
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8. (a)(V) 2x 6 — x + 5 = (fx 4 + fx 2 + ^-)(3x 2 — 2) + (—x + ^yt). 

9. (i) For f(x) = x - 3, g(x) = x^3 + x^2 - 13x + 3, g(x) = (x^2 + 1)f(x).

10. (iii) g(x) = (x^2 + x + 1)f(x) + (2x^2).

13. f(x)g(x) is always monic; f(x) + g(x) may or may not be monic.

14. (0 (f(x), g(x)) = x 2 - fx - 1 = -i/(x) + (l)^(x). 

(iii) g(x) = (x^2 - x + 1)f(x); so (f(x), g(x)) = f(x) = (1)f(x) + (0)g(x).

15. (i) (f(x), g(x)) = 1 = (x^2 + x)f(x) + (1)g(x).

(ii) (f(x), g(x)) = x + 1 = (x^3 + x)f(x) + (1)g(x).

20. Proceed as for the integers. 

22. Why not? More on this in Section 3-5. 

23. (i) Irreducible; (ii) irreducible; (iii) (x - ω)(x - ω^2), where ω =
(-1 + √-3)/2; (iv) irreducible; (v) (x - 1)^2; (vi) irreducible; (vii)
(x + 3)(x - 2); (viii) irreducible.

25. f(x) might be the product of two irreducible polynomials of degree 2.

26. Every element of Z_7 is a root;

x^7 - x = x(x - 1)(x - 2)(x - 3)(x - 4)(x - 5)(x - 6).
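
The factorization in Problem 26 can be confirmed by expanding the product of the linear factors over Z_7. A small sketch (helper names are ours); coefficient lists are written constant term first, so x^7 - x becomes [0, 6, 0, 0, 0, 0, 0, 1] because -1 ≡ 6 (mod 7).

```python
def poly_mul_mod(f, g, p):
    """Multiply polynomials given as coefficient lists (constant term first), mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def product_of_linear_factors(p):
    """Expand (x - 0)(x - 1)...(x - (p - 1)) over Z_p."""
    prod = [1]
    for a in range(p):
        prod = poly_mul_mod(prod, [(-a) % p, 1], p)
    return prod

print(product_of_linear_factors(7))
```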

31. (i) No; (ii) m = 2, 3, 6. 33. Proceed as in Section 1-5.


3-5-29 / ANSWERS

1. (i) f(x) = 3x — l; ( ii)f(x) = l . One is finding the equation of the 
straight line passing through two given points. 

2. (0 f(x) — 1 + \x 4- ^x 2 ; (iii) f(x) = § — \x 4- -fx 2 . 

3. 00 Over Z_5, if f(0) = 1, f(1) = 2, f(2) = 4, then f(x) = 1 + 3x + 3x^2.
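
The interpolating polynomial over Z_5 can be recomputed with the Lagrange formula, dividing by means of modular inverses. This is an illustrative sketch only; the function names are ours, and pow with exponent -1 requires Python 3.8 or later. Coefficients are listed constant term first.

```python
def poly_mul_mod(f, g, p):
    """Multiply polynomials given as coefficient lists (constant term first), mod p."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

def lagrange_mod(points, p):
    """Coefficients of the polynomial of degree < len(points) through the points, over Z_p."""
    result = [0] * len(points)
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = poly_mul_mod(basis, [(-xj) % p, 1], p)   # times (x - xj)
                denom = (denom * (xi - xj)) % p
        scale = (yi * pow(denom, -1, p)) % p
        for k, c in enumerate(basis):
            result[k] = (result[k] + scale * c) % p
    return result

print(lagrange_mod([(0, 1), (1, 2), (2, 4)], 5))
```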

4. f(x) = 2x^3 - x^2 + 3x + 1. 6. f(x) = 3x^2 + 4x + 2.

8. (i) 10381; (ii) 10381; (iii) 1; (iv) 1; (v) 1; (vi) 1; (vii) 6;

(viii) 8.

10. 00 Yes. 

12. Working with x^3 + x + 3: (i) no roots, i.e., irreducible;

(ii) x(x^2 + 1); (iii) (x - 1)(x^2 + x + 2); (iv) (x + 2)(x^2 + 5x + 5);

(v) (x - 3)(x^2 + 3x - 1); (vi) (x - 2)^2 (x + 4).
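
A cubic over a field is irreducible exactly when it has no root there, so Problem 12 reduces to a root search. The sketch below assumes that parts (i) through (vi) refer to Z_2, Z_3, Z_5, Z_7, Z_11, Z_13 in that order; these moduli are inferred from the printed factorizations and are not stated in the answer.

```python
def roots_mod(coeffs, p):
    """Roots in Z_p of the polynomial with the given coefficients (constant term first)."""
    return [x for x in range(p)
            if sum(c * pow(x, k, p) for k, c in enumerate(coeffs)) % p == 0]

CUBIC = [3, 1, 0, 1]   # x^3 + x + 3

for p in (2, 3, 5, 7, 11, 13):
    print(p, roots_mod(CUBIC, p))
```

Under this reading, the roots found modulo 13 are 2 and 9 ≡ -4, matching the factor (x - 2)^2 (x + 4).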

13. (i) x^2 - 13 is irreducible over Q, and factors as (x - √13)(x + √13)
over R and C; (iii) x^2 - 5x + 6 = (x - 2)(x - 3) over Q, R, and C;
(iv) x^2 + x + 1 is irreducible over Q and R, and over C factors as
(x - ω)(x - ω^2), where ω = (-1 + √-3)/2.
14. (ii) Over Z_3: x^2 - 13 = (x - 1)(x + 1), x^2 + 13 is irreducible;

x^2 - 5x + 6 = x(x + 1); x^2 + x + 1 = (x - 1)^2;
x^2 + 2x - 2 = (x + 1)^2; x^3 - 2 = (x + 1)^3 = (x - 2)^3;
x^3 + 2 = (x - 1)^3; x^3 + x + 1 = (x - 1)(x^2 + x - 1);
x^3 + x^2 + 1 = (x - 1)(x^2 + 2x - 1).

15. (i) Some of them are: x, x + 1, x + 2, x^2 + 1, x^2 + x + 2, x^2 + 2x + 2,

x^3 + 2x + 2.

17. (i) ± ~ * j ; (»0 ±(2-0- 

18. Use the “ quadratic formula,” and find the square root as in Problem 17. 

19. (i) None; (ii) -1; (iii) none; (iv) 1, -1; (v) none; (vii) none.

20. (ii) x^4 + 1 is irreducible over R; over C,

x^4 + 1 = (x - α)(x + α)(x - β)(x + β),

where α = (1 + i)/√2 and β = (-1 + i)/√2.

21. (i) (x - 3)(x - 4)(x - 5)(x + 5).

22. (i), (iii), and (iv) are Eisenstein; (ii) is reducible.


3-6-11 / ANSWERS

1. (i) |66|_343; (ii) |1353|_2401, |1047|_2401.

2. (iv) 14 solutions: |4 + 49t|_343, |46 + 49t|_343 for t = 0, 1, ..., 6.

4. (ii) (a) x ≡ 0, 2 (mod 3), (b) x ≡ 6, 5 (mod 9), (c) x ≡ 6, 23
(mod 27), (d) x ≡ 60, 23 (mod 81). (iii) (a) x ≡ 1 (mod 3),

(b) x ≡ 1, 4, -2 (mod 3^2), (c) x ≡ 4, 13, 22 (mod 27), (d) none.

5. 0*0 (a) H| 2) (*) HU, (c) (3) 8 , (d) |11| 16 , (e) |H| 32 , (/)143| 32 . 

6. (k) None; (r)(u)|0j 5 , (b) [15| 25 , (c)[65j 125 . 

7. ( iv ) None; (vii) (a) x = 0, 6 (mod 7), (b) x = —7, 6 (mod 7 2 ), 

(c) jcs -56, 55 (mod 7 3 ). 

8. For the polynomial x 2 — 2x + 3 of (»), no solutions. For the polyno¬ 
mial x 2 + lx — 5 of (v): (a) x = 5 (mod 15), (6) no solution, (c) x = 
65 (mod 75), (d) no solution. 

9. For the polynomial x^3 + x^2 - 4 of (iv), no solutions. For the polynomial
x^2 + x + 7 of (vii), in case (d) there are six solutions: x ≡ 13, 49, 76,
112, 139, 175 (mod 189).

12. (i) x ≡ 2, 5, -3 (mod 419); (ii) x ≡ 2, 5, -3 (mod 463); (iii) nine
solutions: x ≡ 2, 44, 23, 54, 5, 75, 67, 18, -3 (mod 91).

13. (i) 0; (ii) 1; (iii) 1; (iv) 3; (v) 18.

15. On the face of it, the following possibilities occur: 0, 1, 2, 3, p, p + 1,
p + 2, 2p, 2p + 1, 3p.


3-7-15 / ANSWERS 


1. Quadratic residues are: (i) 1, 4, 9, 16, 8, 2, 15, 13; (ii) 1, 4, 9, 16, 6, 17,
11, 7, 5; (iii) 1, 4, 9, 16, 2, 13, 3, 18, 12, 8, 6.
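
The three lists are consistent with the quadratic residues modulo 17, 19, and 23; the moduli are inferred from the values and are not printed in the answer. A sketch that squares every nonzero residue (the function name is ours):

```python
def quadratic_residues(p):
    """Sorted list of the nonzero squares modulo p."""
    return sorted({(x * x) % p for x in range(1, p)})

for p in (17, 19, 23):
    print(p, quadratic_residues(p))
```

For an odd prime p the list has (p - 1)/2 entries, which agrees with the lengths 8, 9, and 11 above.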

2. Everything (mod 13): (i) ±1; (ii) ±4; (iii) ±2; (iv) none;

(v) none; (vi) ±3; (vii) ±6; (viii) ±5.

3. Use the technique of Section 3-6: (i) x ≡ ±1 (mod 13^2); (iv) none;
(vii) x ≡ 32, 137 (mod 13^2).


10. Note Problem 4. 11. ab is a quadratic nonresidue. 


- - 1 . 


12 . 


5 

-7 

11 

-11 

13 

-13 

97 

101 

103 

617 

619 

911 

No 

No 

No 

Yes 

No 

No 


14. (i) 2; (iii) 0; (v) 2; (vii) 0. 15. Two.

16. (ii) No solutions. 17. (i) 1; (iii) -1; (vi) 1; (ix) 1.

18. (ii) If x^2 ≡ q (mod p) has no solution then x^2 ≡ p (mod q) has no
solution. If x^2 ≡ q (mod p) has a solution, then x^2 ≡ p (mod q) has
two solutions.

21. p ≡ 1, 5, 7, 9, 19, 25, 35, 37, 39, 43 (mod 44).

22. (ii) p ≡ ±1, ±3, ±9, ±13 (mod 40).

23. p ≡ ±1 (mod 24).

27. 2^r, where r is the number of distinct primes dividing b.

28. c ≡ 0 (mod 7), two solutions; c ≡ -1 (mod 7), two solutions;

c ≡ 2 (mod 7), two solutions; c ≡ -2 (mod 7), one solution.

29. (i) Yes, no; (ii) 18 under the restriction (a, 91) = 1, 28 otherwise.

30. One. 32. (ii) It has a solution; (iii) no solution.

Problems for Chapter IV


4-1-12 / ANSWERS


1. Use cancellation. 

3. Use induction on n for n > 0; then treat negative n. 

6. The following are groups: (iii), (vi), (xi), (xiii), (xxii), (xxvi), (xxvii)- 
( xxix ) and (xxxii). 

7. (i) They form a group with table:

      e   a   b   c
  e   e   a   b   c
  a   a   e   c   b
  b   b   c   e   a
  c   c   b   a   e

(ii) They form a group with the same table.

8. The eight elements of the group of the square may be viewed as permutations
of the set {1, 2, 3, 4}, that is, as elements of S_4. The multiplications
are the same, so the octic group can be considered a “subgroup” of
S_4.

9. In 3-2-11, one is dealing with a finite semigroup with cancellation. 

10. The left multiplication, φ_a, need not be a map onto G.

11. For (i)-(iii), label the elements of {Z_2, +} appropriately; for (iv)-(vi),
use {Z_3, +}.

12. No. 


<» - (i 

2 3 4 5 6' 
2 3 4 5 6' 


070 p 7 = ^ 

2 3 4 5 6 ' 
6 3 7 5 2] 

'H 

(iv) ff_1 = 

12 3 4 5 6 
>51673 


(vii) it»i = | 

1 2 3 4 5 6 

4 2 1 3 7 5 


( x ) T°<7° T _ 

x_/l 2 3 4 
13 7 4 5 

5 6 

6 2 
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(xiii) a 7 = 


2 3 4 5 6 7\ 

2 3 4 5 6 7/- 


16. The elements are

e,   σ = ( 1 2 3 4 )   τ = ( 1 2 3 4 )   ρ = ( 1 2 3 4 )
         ( 2 1 4 3 ),      ( 4 3 2 1 ),      ( 3 4 1 2 ).

The multiplication table takes the form:

      e   σ   τ   ρ
  e   e   σ   τ   ρ
  σ   σ   e   ρ   τ
  τ   τ   ρ   e   σ
  ρ   ρ   τ   σ   e

The four-group may be viewed as a “subgroup” of the octic group,
4-1-10, with e = e, σ = τ_2, τ = τ_1, ρ = σ_2. The symmetries are the same as
rigid motions.
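
The table for the four-group can be regenerated by composing the permutations directly. In this sketch a permutation of {1, 2, 3, 4} is stored as the tuple of its images, and the names sigma, tau, rho stand for the three non-identity elements read off from the two-row symbols in Problem 16; the encoding and function name are ours.

```python
def compose(p, q):
    """Return the permutation x -> p(q(x)); permutations are tuples of images of 1..n."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

e = (1, 2, 3, 4)
sigma = (2, 1, 4, 3)
tau = (4, 3, 2, 1)
rho = (3, 4, 1, 2)

names = {e: "e", sigma: "sigma", tau: "tau", rho: "rho"}
for a in names:
    print(names[a], [names[compose(a, b)] for b in names])
```

Each row of the printed output lists the products of one element with e, sigma, tau, rho in turn.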

17. The ten elements are: 


_/l 2 3 4 5\ _/l 2 3 4 

' \1 2 3 4 5)’ \2 3 4 5 

_/l 2 3 4 5\ _/l 2 3 4 

\3 4 5 i 2/’ \4 5 1 2 

_/l 2 3 4 5\ _/l 2 3 4 

Tl \l 5 4 3 2/’ \3 2 1 5 

_/l 2 3 4 5\ /I 2 3 4 

T * \2 1 5 4 3}’ Zs \4 3 2 1 


?)■ 

)■ "‘"(s 

(J 


2 

1 

2 

4 


<)■ 

9 - 


19. a and b should not commute. 20. Look at the pairs {a, a^-1}.

22. The identity, which is the set X itself, is the only unit. 

23. Only («>) gives a group. 

25. The units are: 

(i) {1, 2, 3, 4}; (ii) {(1, 1), (1, -1), (-1, 1), (-1, -1)}.

(iii) {(1, 1), (1, 2), (2, 1), (2, 2)}.

(iv) {(1, 1), (1, 3), (3, 1), (3, 3)}; (v) {(1, 1), (1, 3), (2, 1), (2, 3)}.
(vi) {(1, 1), (1, -1), (2, 1), (2, -1)}.

27. Use x → e^x and x → log x.

30. (i) In general,

( a  b )
( c  d )

has an inverse ⇔ Δ = ad - bc ≠ 0; and in this situation, an inverse is

(  d/Δ  -b/Δ )
( -c/Δ   a/Δ )

(ii) If A ∈ M(Z, 2) then it has an inverse if and only if Δ = ad - bc = ±1,
and an inverse is as in case (i).
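
The inverse formula of Problem 30 translates directly into exact rational arithmetic. This is a minimal sketch with a function name of our choosing; it uses the standard-library Fraction type so that no rounding occurs, and returns None when Δ = 0.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of the matrix with rows (a, b) and (c, d), or None if ad - bc = 0."""
    det = a * d - b * c
    if det == 0:
        return None
    det = Fraction(det)
    return ((d / det, -b / det), (-c / det, a / det))

print(inverse_2x2(2, 1, 5, 3))   # determinant 1: the inverse has integer entries
print(inverse_2x2(2, 0, 0, 2))   # determinant 4: the inverse is not integral
print(inverse_2x2(1, 2, 2, 4))   # determinant 0: no inverse
```

The first two calls illustrate part (ii): with integer entries, the inverse is again integral when Δ = ±1, while Δ = 4 forces fractions.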


4-2-18 / ANSWERS 


2. The subgroups are: (i) 6Z; (ii) 12Z; (iii) 30Z; (vi) 15Z.

3. There are no other nontrivial subgroups.

4. (i), (iii): {1, 4} and {1, 11} are subgroups.

6. In {R, +} consider the set of all positive reals.

7. Excluding the trivial subgroups: (i) none; (ii) {0, 3}, {0, 2, 4};

(iii) none; (iv) {0, 5}, {0, 2, 4, 6, 8}; (v) {0, 7}, {0, 2, 4, 6, 8, 10, 12};

(vi) {0, 9}, {0, 6, 12}, {0, 3, 6, 9, 12, 15}, {0, 2, 4, 6, 8, 10, 12, 14, 16}.
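
The lists in Problem 7 follow from the standard fact that the subgroups of the cyclic group {Z_n, +} are exactly the sets of multiples of the divisors d of n. The sketch below applies that fact; the function name is ours, and the moduli 6, 10, 14, 18 for parts (ii), (iv), (v), (vi) are read off from the answers rather than stated there.

```python
def nontrivial_subgroups(n):
    """Subgroups of the additive group Z_n other than {0} and Z_n itself."""
    return [list(range(0, n, d)) for d in range(2, n) if n % d == 0]

for n in (6, 10, 14, 18):
    print(n, nontrivial_subgroups(n))
```

A prime modulus gives an empty list, in agreement with the answers "none".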

8. Pursue the powers of each element.


10 . 


(i) {(! 
m !(! 
(<•«) ((! 


Mi 

Mi 

Mi 


3 4 
3 4 

3 4 
1 4 

3 4 

4 


))' 

Mi 

?Mi 


)}■ 

M 


1 2. 3 4 

4 3 2 


?)}■ 


(iv) The six permutations that leave 4 fixed, which amounts to S_3.
(v) The octic group. (vi) This will fall out in Section 4-5.

14. For f, g ∈ Map(T, G) define f + g by (f + g)(x) = f(x) + g(x). If G is
multiplicative, define fg by (fg)(x) = f(x)g(x).

16. In Z*_7: 2^-1 = 4 = 2^2, 3^-1 = 5 = 3^5, 4^-1 = 2 = 4^2, 5^-1 = 3 = 5^5, 6^-1 =
6 = 6^1.

(i) Z*_8 = {1, 3, 5, 7}, 3^-1 = 3, 5^-1 = 5, 7^-1 = 7.

(iv) Z*_12 = {1, 5, 7, 11}, 5^-1 = 5, 7^-1 = 7, 11^-1 = 11.

17. For each element of Z*_13, take its powers.

18. (i) {e, σ_2} is the center. (iii) If #(X) ≥ 3 then {e} is the center of S_X.

19. Yes; not if Y is infinite. 

21. Left cosets: H_3 = {e, τ_3}, σ_1 H_3 = {σ_1, τ_2}, σ_2 H_3 = {σ_2, τ_1}.

Right cosets: H_3 = {e, τ_3}, H_3 σ_1 = {σ_1, τ_1}, H_3 σ_2 = {σ_2, τ_2}.

22. (i) {0, 4, 8}, {1, 5, 9}, {2, 6, 10}, {3, 7, 11}; (ii) {0, 3, 6, 9}, {1, 4, 7, 10},
{2, 5, 8, 11}.

23. (ii) {1, 3, 9}, {2, 6, 5}, {4, 12, 10}, {7, 8, 11}; (iii) {1, 3, 4, 9, 10, 12},

{2, 6, 8, 5, 7, 11}.

24. (i) Left cosets: H = {e, σ_2}, σ_1 H = {σ_1, σ_3}; τ_1 H = {τ_1, τ_2}, ρ_1 H =

{ρ_1, ρ_2}.

Right cosets: H, Hσ_1 = {σ_1, σ_3}, Hτ_1 = {τ_1, τ_2}, Hρ_1 = {ρ_1, ρ_2}.

(iii) H = {e, ρ_1, ρ_2, σ_2}, σ_1 H = {σ_1, τ_2, τ_1, σ_3} = Hσ_1.

26. ρH_Y = {τ ∈ S_X | τy = ρy for all y ∈ Y}.

28. The meanings are the same.

30. It remains valid; that is, at least one of #(H), (G: H) must be infinite.

32. They are “essentially” the same.

33. (i) The group has 10 elements so, according to Lagrange, we search only
for nontrivial subgroups of order 2 or 5.
for nontrivial subgroups of order 2 or 5. 

4-3-18 / ANSWERS 

1. (i) [2] = {0, 2, 4, 6, 8}; (ii) [4] = {0, 2, 4, 6, 8};

(iii) [6] = {0, 2, 4, 6, 8}; (iv) [7] = Z_10; (v) [8] = {0, 2, 4, 6, 8}.

2. (i) [3] = {3, 9, 13, 11, 5, 1} = Z*_14; (ii) [5] = {5, 11, 13, 9, 3, 1} = Z*_14;

(iii) [9] = {9, 11, 1}.

3. [1] = [2] = [4] = [7] = [8] = [11] = [13] = [14] = Z_15,

[3] = {0, 3, 6, 9, 12} = [6] = [9] = [12], [5] = {5, 10, 0} = [10].

4. [1] = {1}; [2] = {2, 4, 8, 1}; [4] = {4, 1}; [7] = {7, 4, 13, 1}.

[8] = {8, 4, 2, 1}; [11] = {11, 1}; [13] = {13, 4, 7, 1}; [14] = {14, 1}.

There are no generators of Z*_15.
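
The cyclic subgroups in Problem 4 come from repeated multiplication modulo 15 (the modulus is inferred from the listed powers). A sketch, with a function name of our choosing, that lists the successive powers of a unit until 1 is reached:

```python
from math import gcd

def generated(a, n):
    """Successive powers a, a^2, ... of a unit a modulo n, ending at 1."""
    if gcd(a, n) != 1:
        raise ValueError("a must be a unit modulo n")
    out, x = [], a % n
    while True:
        out.append(x)
        if x == 1:
            return out
        x = (x * a) % n

units = [a for a in range(1, 15) if gcd(a, 15) == 1]
for a in units:
    print(a, generated(a, 15))
```

No list reaches length 8, the number of units, which is the assertion that the group has no generator.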

6. None. 

7. (i)-(iii) are cyclic with generators 5, 2, 2, respectively. (Other generators 
are possible.) 

8. ∞ for (i)-(v); (1 + i)/√2 has order 8 in {C*, •}.

9. [e] = {e}, [σ_1] = {σ_1, σ_2, σ_3, e}, [σ_2] = {σ_2, e}, [σ_3] = {σ_3, σ_2, σ_1, e},

[τ_1] = {τ_1, e}, [τ_2] = {τ_2, e}, [ρ_1] = {ρ_1, e}, [ρ_2] = {ρ_2, e}. The octic group

is not cyclic.

10. Yes. 

14. Use the fact that there is a linear combination of m and r equal to 1.

16. (iii) Take

σ = ( 1 2 3 4 )
    ( 2 3 4 1 );

then [σ] = {σ, σ^2, σ^3, e}. Take

τ = ( 1 2 3 4 )   ρ = ( 1 2 3 4 )
    ( 2 1 3 4 ),      ( 1 2 4 3 ),

and obtain the four-element group {τ, ρ, τρ, e}.

18. Look at

σ = ( 1 2 3 ... m-1 m )
    ( 2 3 4 ... m   1 ).

19. Look at the “pairs” {a, a^-1}.

20. 1, 5, 7, 11, 13, 17, 19, 23.

25. (i) [m, n] Z; (iii) [mZ ∪ nZ] = (m, n)Z.

28. (ii) No. 30. (i) Z; (ii) 3Z; (iii) Z; (iv) 3Z.

31. (i) In {Z_19, +}, [S] = Z_19 for S = {3, 4}, {3, 6}, {3, 4, 6}, {9, 12}.

32. (i) Z*_13; (ii) {1, 3, 4, 9, 10, 12}; (iii) Z*_13; (iv) Z*_13.


/I 2 
a = \2 1 

_/l 2 
<T_ \2 1 


2 3 
3 


37. Use 4-3-11. 


2 

2 3 


2 3 
3 


) and T = (2 

) and ^ = (J 


4 5 
4 5 


3 

4 

2 

3 


t) 


generate S 4 ; 


451) generate Ss ■ 


4-4-15 / ANSWERS 

1. (i), (vii), and (viii) are normal subgroups.

2. G is the octic group, H — { e , t 1? t 2 , cr 2 }, = {e, crj. 

3. #(G) = #(N) • #(G/N). 5. Find x for which xH ≠ Hx.

9. H — { e , tJ, # = {e, t 2 } (notation of 4-1-9). 

15. Image = {1, -1, i, -i}, kernel = 4Z.

16. This is essentially the same as Problem 14 because every homomorphic
image of G is of form G/N.

17. For b ∈ G look at bab^-1.

18. Fix an x_0 ∈ {1, 2, ..., n + 1} and consider all σ ∈ S_{n+1} which keep x_0
fixed. For each x_0, we have an injective homomorphism of S_n → S_{n+1}.

20. (i) Image is all positive reals, kernel is the unit circle which is
{z ∈ C | |z| = 1}; (ii) C*/W is isomorphic to the multiplicative group of
positive reals.

21. With regard to converse, take G = S_3, N = {e, σ_1, σ_2}.

26. (i) Yes: φ(1) = 2, φ(2) = 4, φ(3) = 8, φ(4) = 5, φ(5) = 10, φ(6) = 9,
φ(7) = 7, φ(8) = 3, φ(9) = 6, φ(0) = 1; (ii) No.

27. An isomorphism must map a generator of {Z_{p-1}, +} to a generator of
{Z*_p, •}.
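
The map in Problem 26 (i) agrees with k → 2^k (mod 11), which sends the generator 1 of {Z_10, +} to the generator 2 of {Z*_11, •}, as Problem 27 requires. The sketch below checks that this map is a bijective homomorphism; the closed form 2^k is our reading of the listed values, not a formula printed in the answer.

```python
def phi(k):
    """Candidate isomorphism from {Z_10, +} to {Z*_11, .}: k -> 2^k mod 11."""
    return pow(2, k % 10, 11)

values = [phi(k) for k in range(10)]
print(values)

# Homomorphism property: phi(a + b) = phi(a) * phi(b) in Z*_11.
is_hom = all(phi((a + b) % 10) == (phi(a) * phi(b)) % 11
             for a in range(10) for b in range(10))
# Bijectivity: every nonzero residue mod 11 is hit exactly once.
is_bijective = sorted(values) == list(range(1, 11))
print(is_hom, is_bijective)
```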
30. Make use of Problem 28.

31. Map a generator of G to any element of G; using 28, there are n
endomorphisms.

4-5-19 / ANSWERS 

1. (i) (1275348); (ii) (17624583); (iii) (1426)(3957); (iv) (16982354).

2. (0 /1 2 3 4 5 6 7 8) (iff) /I 2345678 9) 

V 4 3 6 5 8 7 2)' U 3 1 9 6 8 5 7 2)' 

3. (i) σ^2 = (358)(749); (ii) ρ^3 = (2549)(37);

(iii) τ^-1 = (63951)(27)(48); (iv) τσ = (1734256)(89);

(v) στρ = (1763)(28)(59); (vi) στ^-1ρ = (17423)(59);

(vii) σ^τ = (649)(75)(238); (viii) (σ^τ)^σ = (154)(32)(976).

4. 7, 8, 4, 8; ord σ = 6, ord τ = 10, ord σ^2 = 3, ord ρ^-1 = 12,
ord(στ^-1ρ) = 10, ord(σ^τ) = 6, ord((τ^σ)^ρ) = 10.

5. Orbits of σ^2 are subsets (possibly equal) of orbits of σ.

6. One choice for τ is (197683)(452).

7. (i) Odd: (12), (13), (23). Even: e, (123), (132); (12), (13), (23) are of order
2 and constitute a conjugate class; (123), (132) are of order 3 and constitute
a conjugate class; e is a conjugate class.

8. (i) 7 conjugate classes with representatives: e, (12), (123), (1234), (12345),
(12)(34), (123)(45).
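
Since conjugate permutations are those with the same cycle structure, the conjugate classes of S_n correspond to the partitions of n. The representatives in Problem 8 (i) are permutations of five symbols, so the count 7 should equal the number of partitions of 5; that n = 5 is our inference from the representatives. A sketch that enumerates partitions:

```python
def partitions(n, max_part=None):
    """All partitions of n as non-increasing lists of positive integers."""
    if max_part is None:
        max_part = n
    if n == 0:
        return [[]]
    result = []
    for k in range(min(n, max_part), 0, -1):
        for rest in partitions(n - k, k):
            result.append([k] + rest)
    return result

for cycle_type in partitions(5):
    print(cycle_type)
print(len(partitions(5)))
```

The cycle type [2, 2, 1], for example, corresponds to the representative (12)(34).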

9. Use 4-5-15. 10. Conjugate permutations have the same order. 

11. (i) (12)(34567). 

13. (i) Two-cycles (a, b) are unaffected, three-cycles (abc) are written as
(ac)(ab), four-cycles (abcd) = (ad)(ac)(ab), those of form (ab)(cd) are
unaffected.

15. Not really.

17. (i) S_3; (ii) S_3; (iii) S_4; (iv) A_4; (viii) A_4.

19. First compute σ^τ. 20. Make use of 4-5-17.

22. (i) The only possibilities are subgroups of order 5 (of which there is 
exactly 1) and subgroups of order 2 (of which there are 5). 

24. The tetrahedral group is isomorphic to A 4 . There are 12 symmetries. 


4-6-17 / ANSWERS 

1. (i) 2 belongs to 3 (mod 7); (ii) 3 belongs to 5 (mod 11);

(iii) 7 belongs to 4 (mod 20); (iv) 5 belongs to 6 (mod 21).
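
Saying that a belongs to k (mod m) means that k is the least positive exponent with a^k ≡ 1 (mod m). A sketch that finds this exponent by repeated multiplication (the function name is ours):

```python
from math import gcd

def order_mod(a, m):
    """Least k >= 1 with a^k congruent to 1 mod m; requires m >= 2 and (a, m) = 1."""
    if m < 2 or gcd(a, m) != 1:
        raise ValueError("need m >= 2 and a relatively prime to m")
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k

for a, m in ((2, 7), (3, 11), (7, 20), (5, 21)):
    print(a, m, order_mod(a, m))
```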

2. (i) 4; (iii) 10. 6. 8^8 ≡ 1 (mod 9).

7. (ii) Use the isomorphism Z*_m × Z*_n ≈ Z*_mn.

8. (i) [10, 12, 16] = 240; (ii) 12.

9. (i) 4; (ii) 2; (iii) 6; (iv) 4. 13. Use Problem 11.

14. (0 6; (m) 7 4 • 6; (mi) 60; (i>) [2, 17 2 • 16, 36, 66]. 

16. If p = 3 (mod 4), then —^ is not a primitive root (mod /?). 

17. This generalizes Problem 13. 

18. (i) m = p = 17, g = 5

a        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16

ind_g a 16  6 13 12  1  3 15  2 10  7 11  9  4  5 14  8

The congruences have solutions: x ≡ 5 (mod 17), none,
y ≡ 9 (mod 16).
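
An index table such as the one for p = 17, g = 5 can be generated by listing the successive powers of the primitive root and recording the exponent at which each residue first appears. A sketch (the function name is ours), using the convention of the table that the index of 1 is p - 1:

```python
def index_table(g, p):
    """Map each nonzero residue a mod p to the k in 1..p-1 with g^k congruent to a."""
    table, x = {}, 1
    for k in range(1, p):
        x = (x * g) % p
        table[x] = k
    return table

table = index_table(5, 17)
print([table[a] for a in range(1, 17)])
```

If g were not a primitive root, the table would have fewer than p - 1 entries, which gives a quick test for primitivity.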

19. (iv) m = p = 19, g = 2

a        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18

ind_g a 18  1 13  2 16 14  6  3  8 17 12 15  5  7 11  4 10  9

The primitive roots (mod 19) are: 2, 13, 14, 15, 3, 10.

(v) m = p = 23, g = 5

a        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22

ind_g a 22  2 16  4  1 18 19  6 10  3  9 20 14 21 17  8  7 12 15  5 13 11

The primitive roots (mod 23) are: 5, 10, 20, 17, 11, 21, 19, 15, 7, 14.

(vi) m = p = 29, g = 2

a        1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28

ind_g a 28  1  5  2 22  6 12  3 10 23 25  7 18 13 27  4 21 11  9 24 17 26 20  8 16 19 15 14

The primitive roots (mod 29) are: 2, 8, 3, 19, 18, 14, 27, 21, 26, 10,
11, 15.

20. m = 18, g = 5

a        1  5  7 11 13 17

ind_g a  6  1  2  5  4  3

The primitive roots (mod 18) are 5, 11.

21. For m = 19, the only solution is x ≡ 15 (mod 19).

22. (i) For m = 18, there is no solution; nor is there one for m = 23.

23. First, solve modulo each prime. 24. No solution.

25. (i) a = 5, b = 3; (ii) a = 13, b = 3. 29. Use Problem 12.

30. This is really a familiar statement about the order of an element in a 
cyclic group. 

36. Use 4-6-16. 
Conjugate class, 477
Coset, 412 
Cycle, 463 
Cycle structure, 475 
Cyclic group, 421 
Cyclotomic polynomial, 371 

D 

Dedekind, 511 
Degree of polynomial, 261 
De Morgen’s laws, 130 
Derivative, 273, 328 
Derived group, 456 
Determinant, 510 
Diagonal, 120 
Diagonal matrix, 207 
Dihedral group, 486 
Diophantine equation, linear, 42 
Direct product, 119, 138, 400, 492 
Direct sum, 119, 138, 205, 492 
Dirichlet’s theorem, 24 
Discriminant, 314, 479 
Disjoint cycles, 463, 464 
Distributive laws, 91
  generalized, 163
Division algorithm, 6, 278
Division ring, 376
Divisor, 2, 230
  common, 13, 283
  greatest common, 13, 38, 283
Domain, 101 
Double coset, 514 

E 

Eisenstein, 351 
Eisenstein’s criterion, 317 
Empty set, 125 


Endomorphism, 206, 459 
Epimorphism, 445 
Equivalence class, 276 
Equivalence relation, 276 
Eratosthenes, sieve of, 26 
Euclid, 23
Euclidean algorithm, 11, 282
Euclidean domain, 516, 517
Euler, 345
Euler φ-function, 240
Euler’s method, 46
Euler’s theorem, 249
Evaluation function, 273
Even, 10
Even permutation, 481
Even position, 81 
Expansion, 72, 370 
Exponent, 490, 491 
Exponential function, 450 
External direct sum, 492 
External direct product, 492 

F 

Factor, 2, 230
Factor group, 439, 443
Factor theorem, 269, 295
Factorial, 168
Factorization, unique, 25, 288
Fermat prime, 88
Fermat’s theorem, 171, 249, 267
Fibonacci sequence, 152
Field, 230, 237
  of quotients, 368
Four group, 398
Function, 175
Fundamental theorem of algebra, 310

G 

Gauss’ lemma, 318, 346
Gaussian integers, 110
Generalized commutative-associative law, 161
Generated by, 425
Generator, 421
Good move, 81
Graph, 269
Greater than, 140
Greatest common divisor (gcd), 13, 38, 283, 517
Greatest integer function, 348, 373
Group, 381
  of inner automorphisms, 455
  of permutations, 393
  of square, 392
  of units, 387


H 

Hexahedral group, 513 
Homomorphism, 182, 190, 439, 445 


I 

Ideal, 515 
Idempotent, 103 
Identity, 91, 381 
Image, 175, 176 
Indeterminate, 253 
Index, 500
  of subgroup, 416
Induction, 147
Inductive set, 147
Infinitude of primes, 24
Injection, 185, 198
Injective map, 176, 445
Inner automorphism, 455
Integers, 1
Integral domain, 101
Integral point, 42 
Internal direct product, 518 
Internal direct sum, 205 
Interpolation, 300 
Intersection of sets, 125 
Irrational, 309, 374 
Irreducible, 287, 517 
Inverse, 91, 93, 202, 381 
Isomorphic, 191 
Isomorphism, 182, 190, 445 
Isomorphism theorem, 452, 459 


J 

Jacobi symbol, 359 


K 

Kernel of homomorphism, 189, 446 
Klein’s four group, 398 


L 

Lagrange interpolation, 300, 372
Lagrange’s theorem, 416
Lattice point, 352
Laws of exponents, 165, 235, 384
Leading coefficient, 254
Leading term, 254
Least common multiple (lcm), 36, 294
Left coset, 412
Left ideal, 516
Left inverse, 202
Legendre symbol, 343
Length of cycle, 463
Less than, 139
Linear combination, 4
Linear congruence, 212
Linear diophantine equation, 42
Linear equation, 211
Logarithm, 449

M 

Mapping, 175
Mathematical induction, 147
Matrix, 115
Maximum, 143, 158
Mersenne primes, 87
Minimum, 158
Möbius function, 208
Möbius inversion formula, 209
Modular law, 511
Monic polynomial, 254
Monomorphism, 445
Multiple, 2, 230
  common, 36
  least common, 36, 294
Multiple root, 366 
Multiplication, 91 
Multiplicative function, 208, 242 
Multiplicative property, 140 
Multiplicity, 366 

N 

Negative element, 138
Nilpotent element, 173
Nim, 80
Nonsingular, 511
Nontrivial orbit, 464
Norm, 376
Normal subgroup, 439, 443
Normalizer, 511
Number line, 7, 104


O 

Octahedral group, 513
Octic group, 392
Odd integer, 10
Odd permutation, 481
Odd position, 81
One-one correspondence, 180
One-one mapping, 176
Onto map, 176
Operation, 91
Orbit, 464, 514
Order, 191
  of element, 426
  of group, 416
Order-isomorphism, 193
Ordered domain, 138
Outer automorphism, 455


P 

Partition, 277 
Permutation, 393 
Permutation group, 460 
Pigeon-hole principle, 403 
Place value system, 72 
Plus, 91 

Polynomial, 254 
Polynomial function, 266 
Polynomial ring, 254, 367, 480 
Positional notation, 72 
Positive elements, 138 
Power residue, 505 
Powers of element, 165 
Preimage, 513 
Prime, 21, 287, 377, 517 
Prime factorization, 25, 288 
Primes, infinitude of, 24 
Primitive polynomial, 316 
Primitive root, 490 
Principal ideal, 515 
Principal ideal domain, 517 
Product, 91, 516 
Projection, 185, 198, 494 
Proper ideal, 515 
Proper subgroup, 434 


Q 

Quadratic formula, 312, 342 
Quadratic nonresidue, 342 
Quadratic reciprocity, 340, 353 
Quadratic residue, 342 
Quaternions, 135 
Quotient, 10, 279 
Quotient field, 368 
Quotient group, 443 

R 

r-cycle, 463, 464 
Radix representation, 64, 67 
Rational form, 370 
Rational function, 370 
Rational numbers, 104 
Real numbers, 104 
Real quaternions, 207 
Reduced residue system, 249 
Reducible polynomial, 288 
Reflexive property, 51, 276 
Regular hexahedron, 513 
Regular octahedron, 513 
Relation, 276
Relatively prime, 19, 38, 240, 286, 517
Remainder, 10, 279
Remainder theorem, 294
Representation, 67
Residue, 342
Residue class, 55
Residue class map, 183
Right coset, 412
Right ideal, 516
Right inverse, 201
Rigid motion, 388
Ring, 91
  of functions, 122
  of matrices, 117
  of sets, 127
  with unity, 97
Root, 269, 366

S 

Scalar matrices, 207
Semigroup, 381
Sgn, 481
Sieve of Eratosthenes, 26
Simple group, 515
Simple ring, 519
Simple root, 366
Smaller than, 140
Solution, 42, 211, 212, 269, 322, 335
Strictly triangular matrices, 207
Subdomain, 133
Subgroup, 401
Subring, 111
Subset, 174
Substitution, 265
Sum, 91, 516
  of subrings, 133
Sun-Tse, 229 
Surjective map, 176, 445 
Symmetric difference, 126 
Symmetric group, 393 
Symmetric property, 51, 276 
Symmetry, 389 

T 

Taylor’s theorem, 340 
Term of polynomial, 253 
Tetrahedral group, 490 
Times, 91 
Totient, 240 
Transcendental, 253 
Transitive property, 51, 140, 276 
Transpose of matrix, 135 
Transposition, 463 
Triangular matrices, 207 
Trichotomy law, 140 


Trivial homomorphism, 183 
Trivial subgroup, 403 

U 

Undetermined coefficients, 300 
Union of sets, 125 
Unique factorization, 25, 288 
Unique factorization domain, 517 
Unique mod m, 221 
Unit, 230, 234 
Unit circle, 388, 407 
Unity, 97 

Universal exponent, 498

V

Value, 175

W

Well defined, 60
Well ordered domain, 145
Wilson’s theorem, 304

Z

Zero, 93
  of polynomial, 269
Zero divisor, 101
Zero element, 93
Zero map, 183





